Learning Linear Algebra with ISETL

Kirk Weller, University of North Texas
Aaron Montgomery, Central Washington University
Julie Clark, Hollins University
Jim Cottrill, Illinois State University
Maria Trigueros, Instituto Tecnologico Autonomo de Mexico
Ilana Arnon, Centre for Educational Technology
Ed Dubinsky, RUMEC

Preliminary Version 3
July 31, 2002

© 2002 by Research in Undergraduate Mathematics Education Community
All rights reserved.

Preface

The authors wish to express thanks to Don Muench of St. John Fisher College (Rochester, NY) whose work with linear algebra and ISETL gave us the basis for our work. His code was written at Gettysburg College in 1991 with students there, so we thank Jared Colflesh, Ben Papada, Julie Leese, and Dave Riihimaki.

This work is a collaborative effort both in authorship and its conception. We acknowledge the assistance of the following members of RUMEC who have worked with us at various stages of the project:

Broni Czarnocha      David DeVries     Clare Hemenway
George Litman        Sergio Loch       Rob Merkovsky
Steve Morics         Asuman Oktac      Vrunda Prabhu
Keith Schwingendorf

Many students and faculty have used these materials and have helped us to refine their intent and presentation. Those members of RUMEC who have implemented some or all of these sections are Ilana Arnon, Julie Clark, Sergio Loch, Steve Morics, Keith Schwingendorf, and Kirk Weller. Our special thanks go to the brave faculty who are implementing this approach and its materials from beyond RUMEC. They and their students will guide us in taking this preliminary version to its next level.

Robert Acar, University of Puerto Rico-Mayaguez

Felix Almendra Arao, Unidad Profesional Interdisciplinaria en Ingenieria y Tecnologias Avanzadas, Mexico

Contents

Preface

1 Functions and Structures
    1.1 Introduction to ISETL
        Activities
        Discussion
            Getting Started
            Simple Objects and Operations
                Modular arithmetic
                Variables
                Boolean
            Control Statements
                if statements
                for loops
        Exercises
    1.2 Structures and Operations
        Tuples
        Sets
        Tuple and Set Formers
        Set Operations
        Tuple and Set Operations
        Sets of Tuples
        Quantification
        Modular Arithmetic
        Exercises
    1.3 Functions
        Funcs and Their Syntax Options
        Funcs for Binary Operations
        Funcs to Test Properties
        Tuples and Smaps
        Procs
        The Fields Zp
        Polynomials and Polynomial Functions
        Exercises

2 Vectors and Vector Spaces
    2.1 Vectors
        Activities
        Discussion
        Exercises
    2.2 Introduction to Vector Spaces
        Activities
        Discussion
            Finite Vector Spaces
            Infinite Vector Spaces
            Non-Tuple Vector Spaces
            Basic Properties of Vector Spaces
            name vector space
        Exercises
    2.3 Subspaces
        Determination of Subspaces
        Non-Tuple Vector Spaces
        Exercises

3 First Look at Systems
    3.1 Systems of Equations
        Algebraic Expressions and Linear Equations
        Forms of Solution Sets
        Systems of Linear Equations
        Summarizing the Process for Finding the Solution of a System of Equations
        Exercises
    3.2 Solving Systems Using Augmented Matrices
        Activities
        Discussion
            Using Augmented Matrices
            Summarizing the Process for Finding the Solution of a System of Equations Using an Augmented Matrix
        Exercises
    3.3 A Geometric View of Systems
        Equations in Two Unknowns
        Equations in Three Unknowns
        Systems of Three Equations in Three Unknowns
        Exercises

4 Linearity and Span
    4.1 Linear Combinations
        The Difference Between a Set and a Sequence
        Forming Linear Combinations
        Simplified Single-Vector Representations
        Geometric Representation
        Vectors Generated by a Set of Vectors: Span
        What Vectors Can You Get from Linear Combinations?
        Non-Tuple Vector Spaces
        Exercises
    4.2 Linear Independence
        Definition of Linear Independent and Linear Dependent
        Geometric Interpretation/Generating Sets
        Non-Tuple Vector Spaces
        Exercises
    4.3 Generating Sets and Linear Independence
        Generating Sets and Their Spans
        Constructing Linearly Independent Generating Sets
        Properties of Linear Independence and Linear Dependence
        Non-Tuple Vector Spaces
        Exercises
    4.4 Bases and Dimension
        Summation Notation
        Bases
        Expansion of a Vector with respect to a Basis
            Representation of a vector space as K^n
        Finding a Basis
            Finite dimensional vector spaces
            Characterizations of bases
        Dimension
            Dimensions of Euclidean spaces
        Non-Tuple Vector Spaces
        Exercises

5 Linear Transformations
    5.1 Introduction to Linear Transformations
        Functions between Vector Spaces
        Definition and Significance of Linear Transformations
        Component Functions and Linear Transformations
        Non-Tuple Vector Spaces
        Exercises
    5.2 Kernel and Range
        The Kernel of a Linear Transformation
        The Image Space of a Linear Transformation
        Bases for the Kernel and Image Space
        The General Form of a System of Linear Equations
        Non-Tuple Vector Spaces
        Exercises
    5.3 New Constructions from Old
        Scalar Multiple of a Linear Transformation
        The Sum of Two Linear Transformations
        Equality of Linear Transformations
        A Set of Linear Transformations as a Vector Space
        Creating New Linear Transformations
        Compositions of Linear Transformations
        Exercises

6 Systems, Transformations and Matrices
    6.1 Vector Spaces of Matrices
        Vector Spaces of Matrices
        Subspaces of Matrices
        Summation Notation
        Dimensions of Matrix Vector Spaces
        Linear Transformations of Matrices
        Exercises
    6.2 Transformations and Matrices
        The Rank of a Matrix
        The Matrix of a Linear Transformation
        Properties of Matrix Representations
        Retrospection
        Exercises
    6.3 Matrix Multiplication
        Matrix Multiplication
        Multiplication as Composition
        Invertible Matrices and Change of Bases
        Exercises
    6.4 Determinants
        Activities
        Discussion
        Exercises

7 Getting to Second Bases
    7.1 Change of Basis
        Coordinate Vectors
            Alias and alibi
        Matrix Representations
        Matrices with Special Forms
            Triangular matrices
            Diagonal matrices
        Exercises
    7.2 Eigenvalues and Eigenvectors
        Basic Ideas
        Bases of Eigenvectors
        What Can Happen?
        Exercises
    7.3 Diagonalization and Applications
        Relationship between Diagonalizability and Eigenvalues
        Conditions that Guarantee Diagonalizability
        A Procedure Diagonalizing a Transformation
        Using Diagonalization to Solve a System of Differential Equations
        Markov Chains
        Exercises

Chapter 1

Functions and Structures

ISETL is a Mathematical Programming Language. Before you run away, the emphasis is on the Mathematical aspects. ISETL is a tool for constructing mathematics using a computer. However, it is a language for programming the computer, so you will need to learn some commands and syntax. This chapter assumes no prior knowledge of ISETL or of programming. It gets you started in a gentle manner and sets you on your way to learning linear algebra. Rest assured that you will be learning plenty of linear algebra in this chapter, too.

1.1 Introduction to ISETL

Activities

1. Use the documentation provided for your computer to make sure that you can answer the following questions.

(a) How do you turn the computer on?

(b) How do you turn the computer off?

(c) How do you enter information? From the keyboard? A mouse? Disks?

(d) How do you move around the screen? With keys? A mouse?

(e) How do you make files, and how are they organized?

(f) How do you save, back-up or discard files?

2. Use the documentation provided for ISETL to make sure that you can answer the following questions.

(a) How do you start an ISETL session?

(b) How do you end an ISETL session?

(c) How do you enter information to the system by typing directly?

(d) How do you transfer information from a file to the system?

(e) How do you make changes, correct errors, add or delete material?

(f) How do you save the work that you do in a session?

(g) How do you print from a file or a window?

3. Following is data that can appear on the screen when you are in ISETL. The code on a line beginning with a > or >> prompt must be entered by you. (The symbols > and >> are ISETL prompts. You do not enter the prompt; ISETL will provide it for you.) The end of such a line indicates that you should press Return or Enter. The other lines are put on the screen by ISETL.

Start ISETL and operate interactively to enter the appropriate lines and obtain the indicated responses (note that the number of decimal places may vary).

>

> $NUMBERS

>

> 7 + 18;

25;

> 13 * (-233.8);

-3039.400;

> 170

>> + 237 - 460

>> * 2

>> ;

-513;

> 3 + 2; 2 +

5;

>> 1;

3;

> 3 + $this is a comment

>> 2;

5;

> 9/3;

3.000;

> 9/4;

2.250;

> 9/0;

!Error: Divide by zero

> 3 ** 2;

9;

> 3**(1.2);

3.737;

> 9**(1/2);

3.000;

> (-1)**(1/2);

OM;

>

> $VARIABLES

>

> x := 2;

> x;

2;

> X;

OM;

> a := 1; b := 2; a := b; b := a;

> a; b;

2;

2;

>

> $BOOLEANS

>

> 6 = 2 * 3;

true;

> 5 >= 2* 3;

false;

> is_integer(3/2);

false;

> is_integer(9/3);

true;

> is_integer(4.00);

true;

> x := 2;

> x < 3 and x > 1;

true;

> x > 2 and x < 2;

false;

> x < 0 or x > 1;

true;

> x < 1 impl x > 3;

true;

> not true and false;

false;

> not (true and false);

true;

>

> $IF

>

> x := 2;

> if x > 2 then


>> write "x is larger than 2";

>> else

>> write "x is 2 or smaller";

>> end if;

x is 2 or smaller

> x := 4;

> if x > 2 then

>> write "x is larger than 2";

>> elseif x > 3 then

>> write "x is larger than 3";

>> end if;

x is larger than 2

> if x > 2 then

>> write "x is in (2, infinity)";

>> elseif x > 1 then

>> write "x is in (1, 2]";

>> elseif x > 0 then

>> write "x is in (0, 1]";

>> else

>> write "x is in (-infinity, 0]";

>> end if;

x is in (2, infinity)

> x := 1;

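> if x > 2 then

>> write "x is in (2, infinity)";

>> elseif x > 1 then

>> write "x is in (1, 2]";

>> elseif x > 0 then

>> write "x is in (0, 1]";

>> else

>> write "x is in (-infinity, 0]";

>> end if;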

x is in (0, 1]


> $FOR LOOPS

> S := {2..6};

> y := 1;

> for x in S do

>> y := x * y;

>> end for;

> y;

720;

> S := {1..3};

> a := 0;

> for x, y in S do

>> a := a + x + y;

>> end for;

> a;

36;

4. In parts (a)–(g), you will be asked to work with code involving modular arithmetic. Modular arithmetic will be used throughout this course.

(a) Following is a list of items for you to enter into ISETL. Before entering, guess and write down what the response in ISETL will be. In any case where the response is different from what you predicted, try to understand why.

> 5 mod 5;

> 7 mod 5;

> 7 mod 7;

> 3 mod 5;

> (2 + 3) mod 5;

> -2 mod 5;

(b) Do you understand the meaning of the operation mod? Write an explanation of this operation.

(c) There are several ways to visualize the operation mod. We shall start with one method of visualization, and present another later in the chapter.

To draw numbers mod 7 (here 7 is the right operand) draw a number line and mark 0 and consecutive multiples of 7 (both negative and positive, on both sides of 0) as long as your page permits. Leave reasonable and (approximately) equal distances between every two consecutive multiples. (Why equal distances?)

Choose any integer n (try first an integer which is not a multiple of 7), which is in the range of your number line.

Draw your integer approximately on the line. How did you know where to locate n: Between which two multiples of 7? Nearer to which of the two?

Shift your integer by multiples of 7 until it falls within the interval [0, 7). The number corresponding to this location of your shifted integer n is n mod 7. According to the drawing in Figure 1.1, 20 mod 7 = 6. Check this answer in ISETL.

[Figure 1.1: A number line for mod 7. The line is marked at the multiples -21, -14, -7, 0, 7, 14, 21, with n = 20 and 6 = 20 mod 7 labeled.]

(d) Choose three more integers n1, n2, n3, among them a multiple of 7 and a negative. Locate them also on the number line, shift, and find ni mod 7. Check your answers in ISETL.

(e) Following is another list of items for you to enter into ISETL. Before entering, use number lines to predict and write down what the response in ISETL will be. In any case where the response is different from what you predicted, try to understand why.

(f) > 21 mod 7;

> (3 + 4) mod 7;

> 3 = -4 mod 7;

> (1 + 6) mod 7;

> 1 = -6 mod 7;

> (4 * 2) mod 7;

> (2 + 5) mod 6;

> (2 * 5) mod 6;

> 2/3 mod 6;

> 2 * 1 = (2 * 4) mod 6;


> 2 * 4 = (2 * 1) mod 6;

> (2 * 4) mod 6 = (2 * 1) mod 6;

(g) What did your number line look like for the last set of ISETL code given in (f)? Why?

5. Following again is a list of items for you to enter into ISETL. Before entering, predict and write down what the response in ISETL will be. In any case where the response is different from what you predicted, try to understand why.

It may be more convenient for you to work with a file and to copy items to the screen as you enter them. The specific instructions for doing this will vary with the system.

(a) > b := 10;

> b;

> b + 20;

> b := b - 4; b; B;

(b) > (2 /= 3) and ((5.2/3.1) > 0.9);

> (3 <= 3) impl (3 = 2 + 1);

> (3 <= 3) impl (not (3 = 2 + 1));

> (3 > 3) impl (3 = 2 + 1);

> (3 > 3) impl (not (3 = 2 + 1));

(c) > 7 mod 4; 11 mod 4; -1 mod 4;

> (23 + 17) mod 3;

(d) > a := 0; b := 1; c := 2; d := 3;

> a := d; b := c; c := b; d := a;

> a; b; c; d;

(e) > is_integer(5); is_integer(-13);

> is_integer(6/4); is_integer(6/3);

Discussion

Getting Started

Interaction with ISETL through the execution window follows a simple pattern:

• ISETL provides you with a prompt (>);

• You type and edit a line, and end by pressing Return or Enter. (Once you press Return or Enter, you cannot edit the line any further. If you attempt to do so, ISETL will not recognize the changes.);

• ISETL reads the line and attempts to execute any complete statements that it finds (complete statements end with semi-colons);

• If you have an incomplete statement and press Return or Enter, ISETL provides you with a double prompt (>>) with which to continue your statement;

• If at the end of the line, you have produced something which cannot be the start of a correct statement, ISETL returns some funny words and tosses out all of your input back to the last complete statement it executed;

• The cycle starts again.

Since statements may be long and you will lose all of the intermediate work if you make an error on a line, it is a good idea to first type the intended statements somewhere other than the execution window. ISETL allows you to open plain text windows for this purpose. The method for transferring code from a plain text window to the execution window is dependent upon the system.

Another reason to keep your work in a separate window is that ISETL deletes the contents of the Execution Window when it gets too long. In fact, ISETL may not provide you with any warning until after it has done this. In order to retain your work, you will need to make it a habit to save the contents of your Execution Window at regular intervals. Furthermore, you will need to save those contents to a different file each time ISETL has truncated the text (or you will overwrite the file containing the old text). Note that while ISETL tosses out the contents of the execution window, it does not remove the effects of those contents. For example, if you set b to 3 early in the session and that line is discarded by ISETL, then b will remain 3 even after the window has been truncated.

Other than commands, there are two different types of instructions that you can give to ISETL: directives and comments. Directives adjust the ISETL environment and follow a slightly different set of rules than commands. Directives always start at the first ! on the line and continue until the end of the line. Everything on the line after the directive will be discarded. Comments are ignored by ISETL. They start at the first $ on the line and continue until the end of the line.

Simple Objects and Operations

The first simple object ISETL supports is the symbol OM, which means that the result of a computation is undefined.

ISETL supports a number of different types of objects, as well as operators which act on those objects. The most common objects for you to manipulate in ISETL will probably be numbers. There are three different types of numbers that ISETL deals with: integers, fractions, and decimals.

As you would expect, the symbols in ISETL for addition and subtraction are + and -, respectively. In ISETL, multiplication is represented with a *. Placing two numbers next to each other without the * is not supported (you will get an error message about a Bad Mapping or OM). Division is indicated by the slash symbol (/). ISETL also supports exponentiation using the exponentiation operator (**).

Modular arithmetic. One arithmetic operator which you may not have been introduced to before is the mod operator. In Activity 4, you were shown how to use number lines to evaluate mod expressions. You may have realized that to evaluate an expression n mod k you need a number line marked with the multiples of k. After locating n between the appropriate multiples of k, you shift n by multiples of k into the interval [0, k). The value of n mod k is the new location of n after the shift into [0, k).

Is it possible to find n mod k on the number line without shifting n to the interval [0, k), just by locating n between the appropriate multiples of k? Try out some examples of your own to answer this question. Use ISETL to check your answers.

You may have found out that n mod k is the distance between n and the nearest multiple of k at or to the left of n. In other words, mod reduces its left operand to the remainder upon division by its right operand: to find 20 mod 7 you located 20 between 14 and 21. You can represent 20 as 20 = 2 · 7 + 6, hence 20 mod 7 = 6. This is the general interpretation of mod: n mod k = r if and only if 0 ≤ r < k and there exists an integer a such that n = a · k + r. Check both the graphical and arithmetic descriptions with examples of your own. Use ISETL to verify your calculations.
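The arithmetic description of mod can also be checked mechanically. The following is a minimal sketch and not one of the original activities; the names k, ok, r and a, and the range of test values, are chosen here only for illustration. For each n it computes r = n mod k, recovers the integer a from n and r, and records whether n = a · k + r holds with 0 ≤ r < k.

```isetl
$ Illustrative check of the rule n = a*k + r with 0 <= r < k.
k := 7;
ok := true;
for n in {-15..20} do
    r := n mod k;              $ the remainder
    a := (n - r) div k;        $ n - r is an exact multiple of k
    ok := ok and (n = a * k + r) and (0 <= r) and (r < k);
end for;
ok;
```

If mod behaves as described above, ok is still true after the loop. Changing k or the range of n is a quick way to test the rule on examples of your own.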

Variables. In Activity 5, you worked on the code:

> a := 0; b := 1; c := 2; d := 3;

> a := d; b := c; c := b; d := a;

> a; b; c; d;

Were you surprised by the result of this code? Do you think that this result is what the programmer had in mind? Can you guess what was intended? How would you make it right?

ISETL objects may be named either literally (e.g., using the symbol 22 to refer to the number twenty-two) or using variables. A variable is a sequence of letters, digits, underscores (_), carets (^) and primes (') which begins with a letter and is not a reserved word in ISETL. Upper and lower case letters are considered different when determining a variable's value, and so drat and DrAt refer to different objects. Any object can be assigned (or bound) to a variable, and this is done in ISETL with the assignment operator (:=). Once this is done, the value of the variable will equal the value of the object. The value of the variable is determined at the time the variable is assigned and will remain until the variable is explicitly reassigned.
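Two of the points above, that a variable keeps the value it had when it was assigned and that upper and lower case letters are distinguished, can be seen in a short session. This is a small illustrative sketch; the variable names p and q are arbitrary, and drat and DrAt are the names used in the text.

```isetl
$ A variable keeps the value it was given at assignment time.
p := 5;
q := p;       $ q receives the current value of p
p := 7;       $ reassigning p does not touch q
q;            $ displays 5;

$ Case matters: DrAt has never been assigned.
drat := 1;
DrAt;         $ displays OM;
```

The second part mirrors the response to X in Activity 3, where x had been assigned but X had not.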

Boolean. ISETL also supports boolean values (true and false). One means of generating a boolean value is the use of comparison operators on numbers. These are summarized in the table below:

operator    meaning
=           equality
/=          inequality
<           strictly less than
<=          less than or equal
>           strictly greater than
>=          greater than or equal

Boolean values also have their own operators: and, or, not and impl. The first three are fairly clear in meaning. The last one, impl, requires some explanation. The implication operator is false only when a true statement is said to imply a false statement. Based on this, a false statement is said to imply either a true statement or a false statement. The other feature of the impl operator is that it completely evaluates both expressions before testing the implication. This means that a statement like

x < 1 impl x > 3

is only tested for the particular value of x at the time of the test. In this case, if x had the value 2, then the truth value of the statement above would be true. As a result, the ISETL statement does not have the same meaning as the statement “if x < 1 then x > 3,” where x is implicitly assumed to be any possible number. The impl operator is most useful where the variable ranges over all possible values of a particular set. The code to do this will be discussed in the next section.
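To see that the truth value of an impl expression depends on the current value of x, the same expression can be entered for two different values. This is only an illustrative sketch; the value 0 is an arbitrary choice that makes the hypothesis true.

```isetl
x := 2;
x < 1 impl x > 3;    $ hypothesis false, so the implication is true
x := 0;
x < 1 impl x > 3;    $ hypothesis true, conclusion false, so the result is false
```

The first response is true; and the second is false;, even though the expression typed is identical.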

Control Statements

Control statements allow you to direct activities in ISETL. Here we will learn to work with if and for statements.

if statements. An if statement consists of a sequence of branches. Each branch consists of a condition and a statement block. ISETL will test each condition until it finds one which is true, and it will execute the statement block associated with that condition. The first branch is indicated with an if condition then construction, and later branches are indicated with an elseif condition then construction. A final branch can be indicated by an else construction, and it will be executed if no other branch is executed.

The order of the branches may be important since only the block associated with the first matching (true) condition will be executed. This means that while the following is a valid statement in ISETL, it is unlikely that it does what the writer intended:

> if x > 2 then

>> write "x is larger than 2";

>> elseif x > 3 then

>> write "x is larger than 3";

>> end if;

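A more complex example was given in Activity 3:

> if x > 2 then

>> write "x is in (2, infinity)";

>> elseif x > 1 then

>> write "x is in (1, 2]";

>> elseif x > 0 then

>> write "x is in (0, 1]";

>> else

>> write "x is in (-infinity, 0]";

>> end if;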

The general structure of an if statement is:

> if (boolean expression) then

>> (statements)

>> elseif (boolean expression) then

>> (statements)

>> end if;
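Because only the first true condition is acted on, a more restrictive test has to come before a less restrictive one. The following sketch, which is an illustration and not one of the original activities, puts the test x > 3 first so that every branch can be reached for some value of x.

```isetl
x := 4;
if x > 3 then
    write "x is larger than 3";
elseif x > 2 then
    write "x is larger than 2";
else
    write "x is 2 or smaller";
end if;
```

With x set to 4 the first branch is taken. Trying x := 3 and x := 1 exercises the other two branches.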

for loops. A for loop is used to repeat execution of a list of statements a fixed number of times. A for loop begins with the keyword for followed by an iterator. An iterator is a domain specification of one or more variables in a tuple or set. (You will see how tuples and sets work in the next section.) In Activity 3 you worked through the code

> S := {1..3};

> a := 0;

> for x, y in S do

>> a := a + x + y;

>> end for;

> a;

In this example, when the two variables are iterated over the set S, ISETL selects a value of x and then iterates over all of the values for y. Then a second value for x is selected and ISETL iterates again over all values for y. This is repeated for each value of x. One possible order of selection might produce the following sequence of values:

a = 0 + 1 + 1 = 2

a = 2 + 1 + 2 = 5

a = 5 + 1 + 3 = 9

a = 9 + 2 + 1 = 12

a = 12 + 2 + 2 = 16

a = 16 + 2 + 3 = 21

a = 21 + 3 + 1 = 25

a = 25 + 3 + 2 = 30

a = 30 + 3 + 3 = 36.

Hence the output of this code is 36. After the iterator, the for loop has the keyword do, followed by a list of commands (the last example consists of just one command: a := a + x + y).

A for loop is completed by the keywords end for. The general structure of a for loop is:

> for (variables) in (tuple or set) do

>> (statements)

>> end for;
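The two control statements can be combined by placing an if statement inside a for loop. The sketch below is an illustration under our own choice of names (S and npairs are arbitrary); it counts the pairs x, y drawn from S for which x is smaller than y.

```isetl
S := {1..4};
npairs := 0;
for x, y in S do
    if x < y then
        npairs := npairs + 1;    $ count only pairs with x below y
    end if;
end for;
npairs;
```

The iterator runs through all 16 pairs, and 6 of them satisfy x < y, so the last line displays 6;.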

Exercises

1. In your own words, write out explanations for each of the following terms. Note that anything here which is in typewriter font is considered to be an ISETL keyword.

(a) prompt

(b) true

(c) om

(d) mod

(e) boolean

(f) if statement

(g) for loop


(h) input

(i) objects

(j) operations

(k) ;

(l) impl

2. Read the following code, and follow the instructions and/or answer the questions listed after the code.

rp := om;
x := 12;
y := 18;
if is_integer(x) and is_integer(y) and x > 0 and y > 0
then rp := true;
    for i in [2..min(x, y)] do
        if (x mod i = 0) and (y mod i = 0) then
            rp := false;
        end;
    end;
end;
x; y; rp;

(a) Run the code several times, with different initial values for x, y.

(b) In your own words, write out an explanation of what this code does. In particular, explain how the code gets data to work on, what it does with the data, and what is the meaning of the result.

(c) Place this code in an external file. Exit ISETL and then re-enter ISETL to run this code without retyping it.

(d) What does it mean to say that this code tests its input?

(e) Suppose that you run this code and that the value of y you enter is always twice the value of x. Can you be sure of what the value of rp will be? Why?

(f) Add a statement to the code that will display a meaningful announcement about the result.

(g) List some relationships between the values of x and y for which you can always be sure of the value of rp at the end.

(h) Suppose that values a, b for x, y result in rp having the value true. Suppose that this is still the case for values b, c for x, y.

What will happen if you give x, y the values a, c?

3. Look at each of the following sets of ISETL code. Predict what will be the result if the code is entered. Then enter it and note if you were right or wrong. In either case, explain why.

12 div 4; 12 div 5; 12 div -5; -12 div 5; -12 div -4;

12 div 0;

12 mod 4; 9 mod 4; 6 mod 4; 3 mod 4; -2 mod 4;

-4 mod 4; -7 mod 4;

2 = 3; (4 + 5) /= -123; (12 mod 4) >= (12 div 4);

even(2**14); odd(187965*45);

max(-27, 27); min(-27, max(27, -27));

abs(min(-10, 12) - max(-10, 12));

4. Write a for loop which adds up all the numbers between 1 and 100.

5. Find the value of x mod 6 for x = -7, -6, ..., 6, 7.

6. Describe all possible integer values of a for which a mod 6 = 0.

7. Describe all possible integer values of b for which b mod 6 = 4.

8. The following is another visualization of the operator mod . Choose aninteger k between 5 and 20. Like a number line with k-multiples, thefollowing drawing will serve as a representation of mod k.

Draw a circle and mark on it k arbitrary points, in (approximately)equal distances. Write next to one of the points, outside the circle, thenumber 0. Continue labeling the other points, moving in a constantdirection (clockwise or counter-clockwise), writing each consecutive in-teger in its turn next to the following point. After k steps you willwrite the number k next to the point 0, behind it. An example of adrawing for mod k, with k = 7, is given in Figure 1.2.


[Figure 1.2: A circular representation. Points labeled 0 through 6 are spaced around a circle, with 7 written behind 0.]

Continue writing numbers for at least two tours around the circle. Can you describe the numbers that accumulate behind a specific point? Could you continue the list of numbers behind a particular point without moving all the way around the circle? Could you write a mathematical (algebraic) description of the set of the numbers for a specific point?

Choose a point on your circle, and write ISETL code which constructs the set of the numbers of this point up to 100. Compare the output of this code with the numbers you have written in your drawing.

9. Look at each of the following sets of ISETL code. Predict what will be the result if the code is entered. Then enter it and note if you were right or wrong. In either case, explain why.

(a) n := 58; (n div 3) * 3 + n mod 3;

(b) is_integer(-1020.0) and is_integer(-1020);

(c) true impl true; true impl false; false impl true;

false impl false;

(d) (23 + 5) mod 7 = 0 impl 7 mod 7 = 1;

(e) (1 + 6) mod 7 = 0 impl (1 + 5) mod 6 = 0;

(f) (23 + 5) mod 7 = 1 impl -70 mod 7 = 1;

(g) (2 + 3) mod 5 = 0 and (2 * 3) mod 5 = 1;


10. Write ISETL code that will run through all of the integers from 1 to 50 and, each time the integer is even, will write out its square.

11. Change your code in the previous problem so that instead of even integers, it will write out the square each time the integer gives a remainder of 3 when divided by 7.

12. Use ISETL to determine the larger of the sums 2/3 + 8/9 and 4/5 + 6/7. Do you see a pattern in the choice of the four fractions? Run several variations of the pattern and see if you can find a general rule.


1.2 Structures and Operations

Activities

1. Following is a list of items for you to enter into ISETL. Before entering, guess and write down what the response of ISETL will be. In case the response is different from your prediction, try to understand why.

T1 := [0..19]; T1;

T2 := [0, 2..19]; T2;

T3 := [2, 8..21]; T3;

T4 := [3, -5, 1];

T5 :=[2**i + 1 : i in [0..4]]; T5;

T1(5); T2(5); T3(5);

T4(3); T4(1); T5(1);

T3(8);

#T1; #T2; #T3; #T4; #T5;

U:=[1, 2, T2, 3 < 2, [3.5, -100]];

U(7); #U; U(5); U(5)(2);

2 in U; false in U; -100 in U; -100 in U(5);

Z20 := {0..19};

T1; T1; T1; T1;

Z20; Z20; Z20; Z20;

T1(5); Z20(5);

E := [2, 1 > 2, [1, 2]]; E1 := [1 > 2, 2, [1, 2]];

E2 := [2, 1 > 2, [1, 2], 2];

E = E1; E = E2; E1 = E2;

F := {2, 1 > 2, [1, 2]}; F1 := {1 > 2, 2, [1, 2]};

F2 := {2, 1 > 2, [1, 2], 2};

F = F1; F = F2; F1 = F2;

N := {0, 1, {0, 1}}; R := [0, 1, {0, 1}];

N = R;


B := [1, 2, 1, 3, 1, 4];

C := {1, 2, 1, 3, 1, 4};

#B; #C; B; B; B; C; C; C;

K1 := {1, 3, 2} with 5;

#K1; K1;

L1 := [1, 3, 2] with 5;

#L1; L1;

K2 := {1, 3, 2} with 3;

#K2; K2; K2; K2;

L2 := [1, 3, 2] with 3;

#L2; L2; L2; L2;

2. Write a few paragraphs describing your experience with Activity 1. What did you predict? What happened? How do you explain what ISETL did? Rather than just reporting events in chronological order, try to organize your description in some logical order and suggest generalizations. Include a description of the main differences between using [..] (which constructs a tuple or sequence) and {..} (which constructs a set).

3. Write out a verbal explanation of the result of giving each of the following input lines to ISETL.

p := [1, 1, 0]; q := [1, 0, 1];

r := [(p(i) + q(i)) mod 2 : i in [1..3]]; r;

s := [1, 2, 0];

s1 := [(3 * s(i)) mod 5 : i in [1..3]]; s1;

G := {[a, b, c] : a, b, c in [0, 2, 4]};

H := {[a, b, c] : a, b, c in [1, 3]};

K := {[a, b, c] : a, b, c in [1..3]};

H union K; H union G; K union G;

K inter H; H inter G;

H subset K; G subset H;

Z20 := {0..19};


L := {g * h : g, h in Z20 | even(g) and h < 10};

L1 := {g * h : g, h in Z20 | even(g)};

L1 subset L; L subset L1;

Z20 - {0}; 0 in Z20; 0 in Z20 - {0};

S := pow({0, 1, 2, 3}); {0, 1} in S; {} in S;

arb(Z20); arb(Z20); arb(Z20); arb(Z20);

%+[1..10]; %*[1..6]; %or[2=1, 2=2, 2=3, 2=4];

%+{1..10}; %*{1..6}; %or{2=1, 2=2, 2=3, 2=4};

4. (a) Let Z2_3 be the set of all the 3-tuples (tuples with 3 elements) with elements in {0, 1}. How many elements are there in Z2_3?

(b) Write ISETL code that constructs Z2_3, and use it to check your conjecture.


Z20 := {0..19};

Z2_3 := {[a,b,c] : a,b,c in [0,1]}; Z2_3;

forall x in Z20 | (x + 0) mod 20 = x;

forall x in Z20 | (x + 3) mod 20 = x;

exists p in Z2_3 | p(1) < p(2);

exists p in Z2_3 | p(1) = p(2);

exists e in Z20 | (forall g in Z20 | (e + g) mod 20 = g);

forall g in Z20 | (exists g' in Z20 |

(g + g') mod 20 = 0);

forall p, q in Z2_3 |

[(p(i) + q(i)) mod 2 : i in [1..3]] in Z2_3;

choose e in Z20 | (forall g in Z20 | (e + g) mod 20 = g);

e := choose x in Z20 |

(forall g in Z20 | (x + g) mod 20 = g); e;



Z5 := {a mod 5 : a in [-30..50]};

A := {a mod 5 : a in [-100..100]};

#Z5; #A; A = Z5;

C := {c : c in Z5 | (exists d in Z5 | (c + d) mod 5 = 0)};

C;

G := {g : g in Z5 | (exists d in Z5 | (g * d) mod 5 = 1)};

G;

C = Z5; G = Z5; G = Z5 - {0}; #G;

forall a in Z5 | (exists d in Z5 | (a + d) mod 5=0);

forall a in Z5 | (exists d in Z5 | (a * d) mod 5=1);

Z6 := {a mod 6 : a in [-100..100]}; Z6; #Z6;

M := {m : m in Z6 | (exists d in Z6 | (m + d) mod 6 = 0)};

M;

N := {n : n in Z6 | (exists d in Z6 | (n * d) mod 6 = 1)};

N;

M = Z6; N = Z6; N = Z6 - {0};

forall a in Z6 | (exists d in Z6 | (a + d) mod 6 = 0);

forall a in Z6 | (exists d in Z6 | (a * d) mod 6 = 1);

Z7 := {g mod 7 : g in [-50..50]}; Z7;

K := {(5 * g) mod 7 : g in Z7}; K;

H := {(2 * g) mod 7 : g in Z7}; H;

Z5 := {g mod 5 : g in [-50..50]}; Z5;

K1 := {(3 * g) mod 5 : g in Z5}; K1;

H1 := {(2 * g) mod 5 : g in Z5}; H1;

Z20 := {g mod 20 : g in [-50..50]}; Z20;

K2 := {(5 * g) mod 20 : g in Z20}; K2;

H2 := {(2 * g) mod 20 : g in Z20}; H2;

Z6 := {0..5};

K3 := {(5 * g) mod 6 : g in Z6}; K3;

H3 := {(4 * g) mod 6 : g in Z6}; H3;

7. Write ISETL code that will construct the following sets. Run your code to check that it is correct.


(a) The set of all integers from 1 to 1000 whose squares mod 20 are greater than 14.

(b) The set Z2_4 of all 4-tuples (tuples with four elements) with entries from Z2.

(c) The set of all sums of the tuple p with the tuple q where p, q run through all elements of Z2_3.

(d) The set of all elements of the form [[x, y], (x + y) mod 6]

where x, y run through all the elements of Z6.

8. Write ISETL code that will test the truth or falsity of the following statements. Run your code to check that it is correct.

(a) Every element of Z20 is even.

(b) Every element of Z2_3 is a tuple.

(c) Some element of Z20 is a tuple.

(d) Some elements of Z20 are odd.

(e) The product mod 20 of every pair of elements of Z20 - {0} is again in Z20 - {0}.

(f) Every element of Z20 has a corresponding element which when added to it mod 20 gives the result 0.

(g) There is an element of Z20 which when added to any element of Z20 does not change it.

Discussion

Tuples

The ISETL object called tuple is used to represent a finite sequence. In Activity 1, the code for T1 yields the sequence given by the first 20 whole numbers, 0 through 19. Similarly, [-30..50] would yield the sequence of consecutive integers whose first term is −30 and whose last term is 50. In general, any tuple given by [a..b], where a and b are integers and b > a, will yield a consecutive sequence of integers that begins with a and ends with b. What happens if b < a?


T2 in Activity 1 differs from T1 in that the difference between successive terms is 2. Upon receiving input such as [4, 7..15], ISETL constructs the sequence [4, 7, 10, 13]. In general, if a < b < c, with a, b, and c integers, the ISETL tuple [a, b..c] returns an arithmetic sequence with common difference b − a whose first two terms are a and b and whose last term does not exceed c. What does ISETL return if a < b < c does not hold?
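The rule for [a..b] and [a, b..c] can be sketched outside ISETL. The following is a minimal Python analogue, not ISETL code; the function name arithmetic_tuple and its parameter names are hypothetical, and only the increasing case described above is covered.

```python
def arithmetic_tuple(first, last, second=None):
    """Python analogue of the ISETL tuples [first..last] and [first, second..last].

    Only the increasing case discussed in the text is handled; the other
    cases are left as the questions the text poses.
    """
    step = 1 if second is None else second - first
    if step <= 0:
        raise ValueError("this sketch covers only first < second")
    # range() excludes its stop value, so add 1 to allow `last` itself.
    return list(range(first, last + 1, step))


T1 = arithmetic_tuple(0, 19)            # like [0..19]
example = arithmetic_tuple(4, 15, 7)    # like [4, 7..15]
print(len(T1), T1[0], T1[-1])           # 20 0 19
print(example)                          # [4, 7, 10, 13]
```

Note that ISETL counts the components of a tuple from 1, while Python lists count from 0, so positions do not carry over unchanged between the two.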

How would a decreasing sequence be constructed? Non-arithmetic sequences can also be constructed by simply listing all of the elements (T4 in Activity 1 is an example) or by using a formula (T5 in Activity 1 is an example). Of course, arithmetic sequences can also be expressed in either of these ways.

The components of a tuple do not have to be integers, or even numbers. They can be any ISETL objects, including other tuples. U in Activity 1 is such an example: The first term of this sequence is 1, the second is 2, the third is the tuple T2, the fourth is the proposition 3 < 2, and the fifth is the tuple whose elements are the numbers 3.5 and −100.

One of the most important facts about tuples is that their elements come in a fixed, definite order. Each time you evaluate a tuple, you get the same sequence in the same order. This should not be surprising, since a sequence is a function whose domain is a set of positive integers: the output corresponding to the integer 1 is the first element of the sequence, the output corresponding to the integer 2 is the second element of the sequence, and so on. Since tuples retain their original ordering, it is possible to access specified components of a tuple. As you discovered in Activity 1, the value of the expression T1(5) is the value of the fifth component of the tuple T1. What would be the “value” of the seventh component of U?

Sets

The ISETL object set is exactly the same as a finite set in mathematics. The elements of a set can be any ISETL objects, including other sets, as in the set N of Activity 1. This includes the empty set {}, as in {1, {}, {1,2}}. Sets can also have tuples for elements, as in Z2_3 of Activities 4 and 5. Sets can also be elements of tuples. Can you construct such an example?

As with tuples, a code of the form {a .. b}; returns the set of consecutive integers starting with a and ending with b, provided that a and b are integers and a < b. If b < a, then ISETL will return the empty set. Similarly, if a, b and c are integers and a < b < c, the set {a, b..c}; will return a and b, with subsequent elements obtained by adding the constant difference b − a such that no term exceeds c. What elements are returned if the condition a < b < c does not hold?

Sets and tuples differ in two important ways. In a set, order and repetition do not matter, while the opposite is true for tuples. For instance, the set {1, 2, 3} is equal to any set consisting of any permutation of the elements 1, 2, and 3. For example, {1, 2, 3} = {2, 3, 1}. On the other hand, the tuple, or sequence, [1, 2, 3] is not equal to the tuple [3, 2, 1]. This is what you discovered in Activity 1: the terms of the tuple T1 are always presented in the order in which they were entered. This is not the case with the set Z20; ISETL lists the elements of Z20 in varying order.

Repetition, like order, is a second distinguishing characteristic of sequences. This is why the sequences E and E2 in Activity 1 are not equal. However, when the elements of E and E2 were entered as sets, to produce F and F2, the result was different: you found that F = F2 is true. Repeated elements of a set are disregarded: when one uses the operator #, with which ISETL produces the number of elements, one can see that a repeated element in a set is counted but once, while in a tuple it is counted as many times as it appears. As a result, tuples and sets react differently to the operation with. What is the difference? Use the results you obtained in Activity 1 to explain this difference.
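The roles of order and repetition can also be seen in other languages. The following is a small Python sketch, not ISETL code: Python lists stand in for ISETL tuples, Python sets stand in for ISETL sets, and the names seq and st are hypothetical. It leaves the question about with untouched.

```python
# Lists play the role of ISETL tuples; Python sets play the role of ISETL sets.
seq = [7, 8, 7, 9]   # the repeated 7 is kept, in its position
st = {7, 8, 7, 9}    # the repeated 7 is disregarded

print(len(seq), len(st))         # 4 3   (compare with the ISETL operator #)
print([1, 2, 3] == [3, 2, 1])    # False: order matters for sequences
print({1, 2, 3} == {2, 3, 1})    # True:  order does not matter for sets
print([7, 8] == [7, 8, 7])       # False: repetition matters for sequences
print({7, 8} == {7, 8, 7})       # True:  repetition does not matter for sets
```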

Tuple and Set Formers

In addition to defining sets and sequences by listing their elements or terms, the set and tuple objects can be defined using former notation. In Activity 3, the sets G, H, K, L, and L1 are all defined using set-former notation. Whether a set or a tuple, the former has three parts. The first is an expression. Every variable that appears in the expression must either have been assigned a value previously or appear in the second part of the former. The first part of the former is completed with a colon (:). The second part, called the domain specification, takes unassigned variables in the expression and iterates them through previously defined sets or tuples. For example, in the set HK := {6 * n : n in [1..5]}, the first part of the former is the expression 6 * n, and the second part indicates that n is an element of the tuple [1..5]. Although it is not necessary for every variable that appears in the domain specifier to appear in the expression, it is required that every unassigned variable that appears in the expression must also appear in the domain specifier. For example, the tuple r := [(p(i) + q(i)) mod 2 : i in [1..3]] given in Activity 3 is the component-wise sum mod 2 of the previously defined tuples p and q. As a result, p and q do not need to appear in the domain specifier. The index i is the unassigned variable iterating through [1..3]. The last part of the former notation is optional. If present, it begins with the symbol | and is followed by a boolean expression, that is, an expression whose value is true or false. For example, L := {g * h : g, h in Z20 | even(g) and h < 10} is the set of all the numbers produced from the expression g * h by substituting all the possible combinations of even elements of Z20 for g and elements of Z20 smaller than 10 for h.

When presented with former notation, ISETL constructs the set (or tuple) by iterating through all possible combinations of values of the variables defined in the domain specification. For each combination of values of the variables, the boolean expression in the third part is evaluated. If the third part is not present, then the boolean value is automatically assumed to be true. If the result of evaluating the boolean expression is false, then nothing more is done, and ISETL moves on to the next combination of values for the variables. If the boolean expression is true (or not present), then ISETL evaluates the expression and returns the result as an element of the set. Thus, the value of a former expression is the set of all values of the expression obtained by iterating the variables through their domains such that the condition in the third part holds. This is similar to what you were asked to do in Activity 7.
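The evaluation procedure just described can be traced step by step. The following Python sketch (an analogue, not ISETL code) builds the set L from Activity 3 twice: once with an explicit loop that follows the description, and once with a Python set comprehension, which plays a role similar to the ISETL set former. The names L_loop and L_comp are hypothetical.

```python
from itertools import product

Z20 = set(range(20))  # analogue of Z20 := {0..19}

# Step-by-step evaluation, following the description above.
L_loop = set()
for g, h in product(Z20, repeat=2):   # every combination of values for g and h
    if g % 2 == 0 and h < 10:         # third part: the boolean condition
        L_loop.add(g * h)             # first part: the expression

# The same set written as a Python set comprehension.
L_comp = {g * h for g in Z20 for h in Z20 if g % 2 == 0 and h < 10}

print(L_loop == L_comp)               # True
```

The loop makes the three parts of the former visible as three separate lines: the domain specification, the condition, and the expression.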

Set Operations

ISETL can perform the usual set operations. You used these operations in Activity 3. Recall, as you read the following summary, that the convention in this text is that any word in typewriter font is an ISETL keyword.

1. The basic idea of a set is that any object is either in the set or it is not in the set. You can test for set membership using the operation in.

2. The union (union) of two sets A and B is the set of all values which are elements of A, B, or both. The intersection (inter) of two sets A and B is the set of all elements which are contained in both A and B.

3. A set A is a subset (subset) of a set B if every element of A is also an element of B.

4. The difference between two sets A and B (-) is the set of all elements which are in A but not in B.

5. The value in ISETL of {} is the empty set, the set which has no elements.

6. The cardinality operator (#) applied to a set A returns the number of elements in A.

7. The operation pow applied to a set A constructs the set of all subsets of A. This is called the power set of A, denoted in ISETL by pow(A).

When the set A is finite, pow(A) can be worked out by first putting into it the empty set {}, then all one-element sets consisting of one of the elements of A, then all the possible two-element subsets of A (consisting of two of the elements of A), and so on, until the largest subset of A, the one of greatest cardinality, which is A itself. What is the cardinality of the power set of an arbitrary, finite set?

8. The operation arb selects an arbitrary element of a set.
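For comparison, the operations in this summary have close counterparts in Python. The sketch below is Python, not ISETL; the sets A and B and the function name power_set are hypothetical, and power_set follows the size-by-size construction described in item 7. Python needs frozenset for sets that are elements of other sets, a restriction ISETL does not have.

```python
from itertools import combinations

A = {0, 1, 2, 3}
B = {2, 3, 4}

print(2 in A)                 # membership, like 2 in A        -> True
union_AB = A | B              # like A union B                 -> {0, 1, 2, 3, 4}
inter_AB = A & B              # like A inter B                 -> {2, 3}
diff_AB = A - B               # like A - B                     -> {0, 1}
print(inter_AB <= A)          # like (A inter B) subset A      -> True
print(len(A))                 # like #A                        -> 4
arbitrary = next(iter(A))     # rough analogue of arb(A): some element of A


def power_set(s):
    """Build the set of all subsets of s, one subset size at a time."""
    items = list(s)
    return {frozenset(c)
            for size in range(len(items) + 1)   # sizes 0, 1, ..., #s
            for c in combinations(items, size)}


S = power_set(A)
print(frozenset() in S)       # the empty set is a subset of A -> True
print(frozenset(A) in S)      # A itself is the largest subset -> True
```

Counting the elements of S for sets A of several sizes is one way to explore the question at the end of item 7.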

Tuple and Set Operations

In Activity 3, you were introduced to operations applied to tuples. For instance, the code %+[3..9] tells ISETL to find the sum of the terms of the sequence 3, 4, 5, 6, 7, 8, 9. If the addition sign were replaced with a multiplication sign, then ISETL would find the product of the terms of the sequence.

If the terms of a tuple consist of boolean expressions, that is, statements that can be judged to be either true or false, code such as %or[3 < 2, 2 > 1, 6 = 7] instructs ISETL to return the value true if at least one of the statements is true. What would the code %and[3 < 2, 2 > 1, 6 = 7] yield?
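The idea behind the % notation, combining all the terms of a sequence with one binary operation, is often called a reduction. The following Python sketch (an analogue, not ISETL code) uses the standard functools.reduce for the three operations mentioned above; the variable names are hypothetical.

```python
from functools import reduce
import operator

terms = list(range(3, 10))                    # the sequence 3, 4, ..., 9

total = reduce(operator.add, terms)           # like %+[3..9]
prod = reduce(operator.mul, terms)            # like %*[3..9]
some_true = reduce(operator.or_, [3 < 2, 2 > 1, 6 == 7])   # like %or[3 < 2, 2 > 1, 6 = 7]

print(total)       # 42
print(prod)        # 181440
print(some_true)   # True
```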

Sets of Tuples

In Activities 4 and 5, you instructed ISETL to construct the set Z2_3. When you typed Z2_3, ISETL returned the set of all 3-tuples (tuples with three components) whose entries were either 0 or 1. As discussed in the previous subsection on sets, the elements of a set in ISETL can be any ISETL objects, including, as you saw in these activities, tuples.


There are a variety of ways to represent a set of tuples. One is to simply list each tuple given in the set. For example, if A is the set of all 2-tuples (tuples consisting of two components) whose components are taken from the first three counting numbers, then we could represent A in ISETL by listing each element:

A := {[1, 1], [1, 2], [1, 3], [2, 1], [2, 2],

[2, 3], [3, 1], [3, 2], [3, 3]};

On the other hand, it would be more convenient to use former notation:

A := {[a, b] : a, b in {1, 2, 3}};

Whenever defining a set of tuples using former notation, the expression, or first part of the set former, will consist of a tuple whose components are various expressions. For instance, if we want to define a set B consisting of all 3-tuples (tuples with three components) of elements of Z20 in which the first component is always zero, the second is always even, and the third is two more than 3 times the second (taken mod 20), then, using set former notation, we would write:

B := {[0, b, ((3 * b) + 2) mod 20] : b in Z20 | even(b)};
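As a cross-check on the two formers above, here is a Python sketch (an analogue, not ISETL code). Python tuples written with parentheses stand in for ISETL tuples, because Python lists cannot be elements of a Python set; that substitution is an assumption of this sketch.

```python
Z20 = set(range(20))  # analogue of Z20 := {0..19}

# Analogue of A := {[a, b] : a, b in {1, 2, 3}};
A = {(a, b) for a in {1, 2, 3} for b in {1, 2, 3}}

# Analogue of B := {[0, b, ((3 * b) + 2) mod 20] : b in Z20 | even(b)};
B = {(0, b, (3 * b + 2) % 20) for b in Z20 if b % 2 == 0}

print(len(A))            # 9, matching the nine tuples listed explicitly
print((0, 6, 0) in B)    # True, since (3 * 6 + 2) mod 20 = 0
```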

Tuples will be used frequently throughout the text. Tuples constitute a special and important kind of vector. Vectors are important objects of study in linear algebra. Sets of tuples will often constitute a vector space. A vector space is a set of vectors with two operations that satisfy certain conditions.

Quantification

Quantified logical statements are used in mathematics to express conditions, usually in a definition, statement of a property, or a construction. forall involves a universal quantifier. In order for a statement involving a universal quantifier to be true, the condition must hold for all possible values of the variable attributed to it. exists involves an existential quantifier. In order for a statement involving an existential quantifier to be true, the condition has to hold for at least one of the values of the variable attributed to it.

In Activity 5, you were asked to evaluate several statements involving universal quantification. The ISETL statement forall x in Z20 | (x + 0) mod 20 = x illustrates the standard form of a universal quantifier: the first part begins with the key word forall and is followed by a domain specification. The domain specification is completed with the symbol |. The second part of the quantifying statement is a boolean expression, that is, the condition that the variable must satisfy.

To evaluate a universal quantification expression, ISETL iterates through the values of the variable in the domain specifier. Thus, in this example, it considers every value of x in the set Z20. For each value, the boolean expression is evaluated. If the boolean expression is found to be false for just one x, then the entire universal quantification statement is false. If the boolean expression is true for every x given by the domain specifier, then the value of the quantification is true.

The existential quantifier is similar, except that it returns true if the value of the boolean expression is true at least once. Consequently, an existential quantifier returns false only when the boolean expression is false for every single value of the variable.

The operation choose is a useful alternative to exists. The syntax is exactly the same, and choose performs the same internal operation as exists. Instead of returning true or false, however, choose will select and subsequently return one value of the variable that makes the condition true. If there is no such value, choose will return OM. Thus, in the case of the following statement from Activity 5, e := choose x in Z20 | (forall g in Z20 | (x + g) mod 20 = g); ISETL will return the value of 0 for e.
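The three constructs can be mirrored in Python, which may help in reading nested quantifiers. The sketch below is Python, not ISETL: all plays the role of forall, any plays the role of exists, and the helper choose is a hypothetical function written for this illustration. It returns the first matching value it meets and uses None in place of OM; both choices are assumptions of the sketch, not a description of how ISETL selects a value.

```python
Z20 = range(20)  # analogue of Z20 := {0..19}


def choose(domain, condition):
    """Return a value of `domain` that satisfies `condition`, or None if none does."""
    return next((x for x in domain if condition(x)), None)


# forall x in Z20 | (x + 0) mod 20 = x
every = all((x + 0) % 20 == x for x in Z20)

# exists x in Z20 | (3 * x) mod 20 = 1
some = any((3 * x) % 20 == 1 for x in Z20)

# choose x in Z20 | (forall g in Z20 | (x + g) mod 20 = g)
e = choose(Z20, lambda x: all((x + g) % 20 == g for g in Z20))

# A condition with no witness: 2 * x is always even, so it is never 1 mod 20.
none_found = choose(Z20, lambda x: (2 * x) % 20 == 1)

print(every, some, e, none_found)   # True True 0 None
```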

Modular Arithmetic

In Activity 6 you could see differences between multiplication mod 7 and mod 5 in Z7 and Z5 respectively on the one hand, and multiplication mod 20 and mod 6 in Z20 and Z6 respectively on the other hand. These differences can be seen when comparing these sets:

K := {(5 * g) mod 7 : g in Z7};

H := {(2 * g) mod 7 : g in Z7};

K1 := {(3 * g) mod 5 : g in Z5};

H1 := {(2 * g) mod 5 : g in Z5};

with the following sets:

K2 := {(5 * g) mod 20 : g in Z20};


H2 := {(2 * g) mod 20 : g in Z20};

K3 := {(5 * g) mod 6 : g in Z6};

H3 := {(4 * g) mod 6 : g in Z6};

What are the differences between them? Notice that H, for example, is a subset of Z7, and is determined by multiplication mod 7. Similarly, each of the other sets is a subset of some Zn and determined by multiplication mod that same n. What differences do you see in the relation between these sets and the Zn they are subsets of?
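One way to organize the comparison is to compute each set together with a test of whether it is all of Zn. The following Python sketch (an analogue, not ISETL code) does this for the eight multiplier and modulus pairs used above; the function names multiples_mod and fills_Zn are hypothetical, and the results are left for you to run and interpret.

```python
def multiples_mod(k, n):
    """Analogue of {(k * g) mod n : g in Zn}, where Zn is {0..n-1}."""
    return {(k * g) % n for g in range(n)}


def fills_Zn(k, n):
    """True when multiplying by k mod n reaches every element of Zn."""
    return multiples_mod(k, n) == set(range(n))


# The pairs (k, n) behind K, H, K1, H1, K2, H2, K3, H3, in that order.
pairs = [(5, 7), (2, 7), (3, 5), (2, 5), (5, 20), (2, 20), (5, 6), (4, 6)]
for k, n in pairs:
    print(k, n, sorted(multiples_mod(k, n)), fills_Zn(k, n))
```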

Exercises

1. How many elements are there in the following set? Use the # operator in ISETL to check your answer.

{2, 3, 6 mod 4, {[1, 1], {-1, 2..5}}, {}};

2. List the elements in each of the following sets and note the number of elements in each. Use ISETL to check your answers. If a set is empty, explain why.

{2..12}

{4..4}

{10..1}

{-2, 4..38}

{0, 3..-1}

{100, 90..-5}

{100, 90..100}

{100, 90..101}

{10, 9..0}

{4, 4..8}

3. For each of the following sets of code, predict what result will be returned by ISETL and then check your answer on the computer.

T := {[3, 4], 3 + 4, 8};

7 in T; 4 in T; (1 + 7) in T;

3 + 5 notin T; 7 notin T;


T = {7 * 1, 15 mod 8};

T /= {}; T /= T; {} subset T; T subset T;

T subset {}; not({8} subset T);

{7, 1 + 6, 49 mod 6, 7 + 0} subset T;

#(T); pow(T);

[3, 2, 1] with 2 = [3, 2, 1];

{3, 2, 1} with 2 = {3, 2, 1};

4. Let Y and Z be defined by

Y := {Z5, 6.9, {2..10}, {{true impl false}, false},

(10 div -4)+16, {5 in {3,6..9}}, {{}}};

Z := {{false or true}, 28 mod 2, Z5, {}, true impl false,

{10, 2, 9, 3, 8, 4, 7, 5, 6}, {false}, abs(-6.9)};

For each expression in the following list, determine if the value is an element of Y, an element of Z, an element of both sets, or neither.

(a) true

(b) false

(c) {true}

(d) {false}

(e) 13 + 1

(f) {2, 3..10}

(g) Z5

(h) {}

(i) {{}}

(j) {0, 1..4}

(k) {10, 9..2}

5. In the context of the previous exercise, do the following.

(a) List every expression whose value is in both Y and Z.

(b) List every expression whose value is in either Y or Z or both.


(c) List every expression whose value is in Y but not in Z.

(d) Can you write an ISETL expression that will give an answer to any of the above?

6. For each of the following ISETL set former expressions, give a verbal explanation of the set and then list the elements. For example, if the expression is {x**2 : x in {2..10} | x mod 2 = 0}; then the verbal description might be: the set of all squares of the even integers from 2 to 10. The list would be {4, 16, 36, 64, 100}.

(a) {x : x in {2, 5..10}};

(b) {r : r in {2, 5..100} | r mod 5 = 0};

(c) {t**4 + t**2 : t in {-6..6} | even(t div 3)};

(d) {even(n) : n in {-3, -1..11}};

(e) {(x * y) mod 3 : x, y in {-8, -7, 0, 7, 8} | x < y};

(f) {{s, t} : s in {10, 8..4}, t in {5..s} | (s + t) mod 2 = 0};

(g) {(p and q) = (q and p) : p, q in {true, false}};

7. (a) Construct addition and multiplication tables for addition mod 5 and multiplication mod 5 in Z5 (fill in the given tables):

(a + b) mod 5 | 0 | 1 | 2 | 3 | 4
0             | 0 |   |   |   |
1             |   |   |   |   |
2             |   |   |   |   | 1
3             |   |   |   |   | 2
4             |   |   |   |   |

(a · b) mod 5 | 0 | 1 | 2 | 3 | 4
0             | 0 |   |   |   |
1             |   |   |   |   |
2             |   |   |   |   | 3
3             |   |   |   |   | 2
4             |   |   |   |   |

(b) Construct similar tables for addition mod 6 and multiplication mod 6 in Z6.

(c) Write a verbal description of all the differences you found between these tables: compare the addition tables to the multiplication tables; compare the tables of mod 5 (in Z5) to those of mod 6 (in Z6).

8. Write ISETL set former expressions for each of the following.

(a) The set of all 3-tuples of elements from a given set of integers K.

(b) The set of all possible sums of two elements, one taken from a given set of integers S and one from a given set of integers T.

(c) The set of all 4-tuples of those elements from a given set of integers K that are even.

(d) The set of all subsets {a, b} of Z5, for which (a · b) mod 5 = 1.

(e) The set of all subsets {a, b} of Z5, for which (a + b) mod 5 = 0.

(f) The set of all subsets {a, b} of Z6, for which (a · b) mod 6 = 1.

(g) The set of all subsets {a, b} of Z6, for which (a + b) mod 6 = 0.

9. Compare the sets you constructed in Exercise 8, parts (d)–(g), to the tables you constructed in Exercise 7.

10. Assume that S is a set and T is a tuple, both of which have been previously defined in ISETL. Write an ISETL expression that will evaluate to a tuple whose components are the elements of S and the set whose elements are the components of T.

11. Evaluate the following tuples and then use ISETL to check your answers.

(a) [x**2 : x in [1, 3..10]];

(b) [[1..r] : r in [0, 2..6]];

(c) [N + 2 < 2**N : N in [0..20]];

(d) [u * v : u in [-5..0], v in [-5..(u + 1)] |

(u + v) mod 3 = 0];

12. Use the ISETL forall, exists, and choose constructs to write code that implements the following statements. Assume that S is the set of multiples of 3 from 0 to 49.

(a) Every odd number in {0..50} is in S.


(b) Every even number in S is divisible by 6.

(c) It is not the case that every even number in S is divisible by 6.

(d) There is an odd number in S, which is divisible by 5.

(e) There is an even number in S, which is divisible by 5.

(f) There is an element m of S such that the number of elements of S that are less than m is twice the number of elements of S that are greater than m.

(g) An element a of S exists such that for every element x of S, there is an element y of S such that the average of x and y is a.


1.3 Functions

Activities

1. For each of the following sets of ISETL code, try to predict what the result would be. Then run the code and check your prediction. Write out a verbal explanation of what the code is doing.

(a) f := func(x);

return (x + 3) mod 6;

end;

f(5); f(0); f(37);

h := |x -> (x + 3) mod 6|;

h(5); h(0); h(37);

h=f;

forall x in [-10..10] | h(x) = f(x);

(b) fact := func(n);

return %*[1..n];

end;

fact(3); fact(5); fact(50);

forall n in [2..20] | fact(n) = n * fact(n - 1);

f:= func(x);

return (x + 3) mod 6;

end;

f(fact(3)); fact(f(114));

(c) Av := func(T);

return %+T/#T;

end;

Av([1, 2, 3, 4]); Av([1, 2, 3, 4, 0]);

CompAv := func(T,S);

return max(Av(T),Av(S));

end;

CompAv([1, 2, 3, 4], [1, 2, 3, 4, 0]);


What is the input of the function Av? What is its output? What is the input of the function CompAv? What is its output?

(d) Z6 := {0..5};

inv := func(x);

if x in Z6 then

return choose g in Z6 | (x * g) mod 6 = 1;

end;

end;

inv(2); inv(5); inv(3); inv(1); inv(0);

(e) Z20 := {0..19};

closed:= func(H);

return forall x, y in H | (x + y) mod 20 in H;

end;

closed({0, 4..19}); closed({3, 5, 9}); closed(Z20);

2. For each of the following specifications, write an ISETL func with the specified input parameters that returns the result of the action that is described. If any auxiliary objects are needed, then construct them as well. Select some specific values and run your func on them to see that it works.

(a) The input parameter is a single number x and the action is to compute the square mod 20 of x.

(b) K is the set Z5 − {0} (Z5 without the element 0). The input parameter is a single variable x and the action is to choose any element of K whose product mod 5 with x is 1.


(a) Z5 := {0..4};

add_5 := func(x, y);

if (x in Z5 and y in Z5) then

return (x + y) mod 5;

end;

end;


add_5(3, 4); .add_5(2, 3); add_5(4, 4);

3 .add_5 4; 2 .add_5 3; 4 .add_5 4

(b) G := Z5 - {0}; G;

mlt_5 := func(x, y);

if (x in G and y in G) then

return (x * y) mod 5;

end;

end;

forall x, y in G | x .mlt_5 y in G;

exists e in G | (forall g in G | e .mlt_5 g = g);

choose e in G | (forall g in G | e .mlt_5 g = g);

id := choose e in G |

(forall g in G | e .mlt_5 g = g); id;

forall g in G | (exists g' in G | g' .mlt_5 g = id);

4. (a) Using the following specification, write ISETL funcs with the specified input parameters that return outputs as described. If any auxiliary objects are needed, then construct them as well. Select some specific values and run your funcs on them to see that they work.

• Let G be Z7-{0} (namely, the set of integers from 1 to 6).

• Construct a function mlt_7 that takes for input two elements of G and gives as output their product mod 7.

• Use the operation mlt_7 in the construction of a func named inv_7. The input of inv_7 is a single value g. The func should check that g is an element of G and, if it is, choose an element of G whose “product” with g under the operation mlt_7 is equal to 1.

(b) If your function inv_7 works properly, predict and then check in ISETL the values of the following expressions:

inv_7(3); inv_7(5); inv_7(2); inv_7(1); inv_7(6);

3 .mlt_7 5; 2 .mlt_7 4; 1 .mlt_7 1; 6 .mlt_7 6;

forall g in Z7 - {0} | (exists g' in Z7 - {0} |

g' .mlt_7 g = 1);


Keep the codes you constructed here for future use.

(c) Repeat Activities 4(a) and 4(b) with Z6. Check the values you obtained with your multiplication-mod-6 table. How are these results reflected in this table?

You may wish to keep these codes as well.

(d) Write a comparison between the behavior of the functions inv_7 and inv_6. How are they similar? How are they different?

(e) Repeat Activities 4(a) and 4(b) with Z5. Check the values you obtained with your multiplication-mod-5 table.

Keep the codes you constructed here for future use.

(f) What is the function inv_5 like: inv_6 or inv_7? Explain.

5. Write ISETL funcs with the given names according to each of the following specifications. In each case, set up specific values for the parameters and run your code to check that it works.

(a) The func is_closed has two input parameters: a set G and a func o which is some operation on two variables from G such as the one in Activity 3. The action of is_closed is to determine whether the result of o, when applied to two elements of G, is always an element of G. This is indicated by returning the value true or false.

(b) The func is_commutative has two input parameters: a set G and a func o which is some operation on two elements from G such as the operations in Activity 3. The action of is_commutative is to determine whether or not the result of o depends on the order of the elements. This is indicated by returning the value true or false.

(c) The func identity has two input parameters: a set G and a func o which is some operation on two variables from G such as the operations in Activity 3. The action of identity is to search for an element e of G which has the property that for any element g of G, the result of the operation o applied to e and g is again g. This is indicated by returning the value of e if it exists or OM if it does not.


(d) The func inverses has three input parameters: a set G, a binaryoperation o on G, and an element g of G. We assume that G ando are such that identity(G, o) is defined (does not return OM).The action of inverses is to first assign the value of identity(G,o) to a variable e, then search for an element g’ of G which hasthe property that the result of the operation o applied to g’ andg is e. The func inverses returns the element g’ if it exists orOM if it doesn’t.

6. (a) Write an ISETL func invertibles, that takes for input a set G

and an binary operation o defined on G. We assume that G ando are such that identity(G, o) is defined (does not return OM).The action of invertibles is to construct the set of all elementsin G that have an inverse with respect to o in G. Namely, all theelements g in G such that there exists an element g’ in G such thatg .o g’=identity(G, o).

(b) What does your func do? What are invertibles?

(c) In the previous activities you constructed the funcs mlt_5, mlt_6 and mlt_7. Apply your func invertibles to the following inputs: (Z5, mlt_5), (Z6, mlt_6), (Z7, mlt_7). (Use .mlt_5, .mlt_6 and .mlt_7 as in Activity 3.)

(d) Summarize your findings in (c): What did your func check? What did you find? How does (Z6, mlt_6) differ from (Z5, mlt_5) and (Z7, mlt_7) (from the point of view of the invertibles)?

(e) Do you see the differences between (Z6, mlt_6) and (Z5, mlt_5) in the multiplication tables you constructed in Exercise 7 in Section 1.2? How are these differences represented there?


7. (a) p := [[1, 3], [2, 4], [3, 2], [4, 1]];

p(1); p(3);

s := {[1, 3], [2, 4], [3, 2], [4, 1]};

s(1); s(3); s(5);


Write an explanation: What did the expression p(i) do for the tuple p? What did the expression s(i) do for the set of tuples s?

s’ := s with [1, 5]; s’;

s’(3); s’(1);

st := [s(i) : i in [1..4]]; st;

st(1); st(3);

(b) next := func(n);

if n in [1..10] then return n + 1;

end;

end;

mnext := {[n, n+1] : n in [1..10]};

mnext;

next(6); mnext(6); next(9); mnext(9);

next(11); mnext(11);

forall n in [0..11] | mnext(n) = next(n);

(c) G := {1..12};

o := func(x,y);



end;

end;

m13 := {[[x, y], x .o y] : x, y in G};

m13;

m13(3, 5); m13(2, 4);

3 .m13 5; 2 .m13 4;

forall x, y in G | x .m13 y = x .o y;

8. Construct functions of your own, with the following specifications:

(a) The function takes two tuples for input, and returns a tuple consisting of the sum of each pair of their components for output.

(b) The function takes a tuple and a number for input, and returns a tuple consisting of the product of the number and each component of the tuple for output.

(c) The function takes two tuples and two numbers for input, and returns a tuple for output. This function should multiply the first number with the first tuple (as in part (b)), multiply the second number and tuple, and then add the results as in part (a).

9. Describe what the following code is doing.

SetNot := proc(pair);

G := pair(1); o := pair(2);

e := choose x in G | (forall g in G | x .o g = g);

inv := {[g, choose g’ in G | g’ .o g = e] : g in G};

end;

pair := [ ];

pair(1) := {0..5};

pair(2) := func(x, y);



end;

end;

pair;

SetNot(pair);

G; e; inv(5); inv(2);

What would happen if, after running this code (and defining the funcs in Activity 5), you entered statements such as 3 .o 5, is_closed(G, o), is_commutative(G, o), identity(G, o)?

Can you imagine why we might want to go to all the trouble of something like SetNot?

10. An ISETL func can also return a func. Look at the following code and write down an explanation of what it does:

add_a := func(a);

return func(x);


return x + a;

end;

end;

Predict the results of the following ISETL statements:

add_a(3); add_a(3)(5); add_a(3)(2); add_a(7)(4);

f3 := add_a(3); f3(5); f3(2);

f7 := add_a(7); f7(4);

11. Write down an ISETL func compose which takes as input two ISETL funcs f and g and returns a func representing their composition; that is, the func compose returns a func which for each input x returns f(g(x)). Use your func to compose some of the funcs previously defined in the Activities, and study the resulting new funcs.

Discussion

Funcs and Their Syntax Options

The notion of function is fundamental to every area of mathematics. A function has to do with transforming an input value, or a collection of input values, into a single output value. In ISETL there are several ways to represent functions, one of which is the func. What kinds of inputs are used in the funcs of Activity 1? What kinds of outputs are used? A func processes (transforms) the input to obtain an output. The syntax of funcs is designed to describe these processes. There are a variety of processes demonstrated in the funcs of the activities. Let us look at the function of Activity 1(e) again:

inv := func(x);

if x in Z6 then

return choose g in Z6 | (x * g) mod 6 = 1;

end;

end;

Funcs must have a header line, a list of statements for ISETL to process, and an end statement. The header line for a func usually starts by naming the func. This is done by an assignment to an identifier. In our example, the line inv := func(x); assigns the name inv to the function; the assignment is followed by the keyword func and the list of parameters enclosed in parentheses. The complete expression is followed by a semicolon.

The func in our example has an if statement for its primary process. If statements terminate with an end; or an end if; command; in our example, this is the first of the two end;’s. As we can see in our example, the label that indicates which process is being completed (end if; end func;) is optional. Every func and every control statement must have its own end statement. In the example, the func also includes a return statement which causes ISETL to evaluate the expression and end the processing. Return statements can only be used inside a func. They instruct ISETL to send the current value of the expression to the computer screen or to some other process in which the func may be embedded. A return statement causes the operation in a func to stop. In our example, for an input which is a number in Z6, the func chooses an element g in Z6 such that (x · g) mod 6 equals 1, and produces it on the screen.

So the general syntax for funcs is as follows:

name := func(list of parameters);

statements;

end;

Once a func is defined (and recorded by ISETL) it can be used with different values of inputs (or parameters). The syntax for applying a func to some (properly) chosen input is: name(parameters). In our example, as the function was assigned to the identifier (the name) inv, and the input is a number in Z6 (otherwise the func responds with OM), the func will operate in response to an expression such as inv(5);.
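As a brief illustration of calling a func, the sketch below repeats the definition of inv and applies it to a few inputs. It assumes that Z6 has been defined as {0..5}; the comments describe the values we would expect from the definition, not recorded ISETL output.

```isetl
Z6 := {0..5};               $ assumed definition of Z6

inv := func(x);
    if x in Z6 then
        return choose g in Z6 | (x * g) mod 6 = 1;
    end;
end;

inv(5);     $ 5, since (5 * 5) mod 6 = 1
inv(1);     $ 1, since (1 * 1) mod 6 = 1
inv(7);     $ OM: 7 is not in Z6, so no return statement is reached
```

The last call shows why the if statement matters: for an input outside Z6 the func ends without executing a return statement.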

Funcs for Binary Operations

When a func has two input parameters, which are assumed to represent two elements from some set, and the returned value is also assumed to belong to the same set, then the func is said to represent a binary operation. Binary operations are extremely important in Algebra in general, and also in Linear Algebra, and you will spend considerable time working with them.

In mathematics, it is the usual practice to write the name of a binary operation between the two parameters, rather than before them. Thus, we write a + b rather than +(a, b). We can do something very similar in ISETL. If o is a func which fulfills the requirements for a binary operation (input consisting of two parameters, output a single value, all three elements of the same set) then, instead of o(e, g) we have the option of writing e .o g. Putting the period before the name of the operation is the signal to ISETL that the operation is between the two parameters, not before them. You used binary operations in this way in Activity 3. What about the func Av of Activity 1(c)? Can it be used as a binary operation? Why, or why not?

In Activity 8, you were to construct funcs to work with tuples and numbers. You probably needed to decide the size of the tuples involved and how to deal with them in the code. For instance, in part (a) you might have used code like:

act8a := func(tup1, tup2);

[a, b, c, d] := tup1;

[e, f, g, h] := tup2;

return [a + e, b + f, c + g, d + h];

end func;

which works well for tuples of length 4. A perhaps more elegant approach is:

act8a’ := func(tup1, tup2);

return [tup1(1) + tup2(1), tup1(2) + tup2(2),

tup1(3) + tup2(3), tup1(4) + tup2(4)];

end func;

but neither of these will work with tuples of arbitrary length or make use of the power of ISETL. Compare the following code to the previous examples. Can you explain how it works?

act8a’’ := func(tup1, tup2);

return [tup1(i) + tup2(i) : i in [1..#tup1]];

end func;

This func computes the “component-wise” sum of two tuples.
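For instance, the same func can be applied to tuples of different lengths and, because it takes two inputs, it can also be written between its arguments with the period notation. The name tup_sum below is ours (a copy of the last func under a name without primes), and the values in the comments come from adding the components by hand.

```isetl
$ Component-wise sum of two tuples of equal length (hypothetical name).
tup_sum := func(tup1, tup2);
    return [tup1(i) + tup2(i) : i in [1..#tup1]];
end func;

tup_sum([1, 2], [10, 20]);                    $ [11, 22]
tup_sum([1, 2, 3, 4, 5], [5, 4, 3, 2, 1]);    $ [6, 6, 6, 6, 6]
[1, 2, 3] .tup_sum [4, 5, 6];                 $ [5, 7, 9]
```

Note that the length is taken from tup1 only, so the func implicitly assumes that both tuples have the same length.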

Funcs to Test Properties

In Activity 5 you constructed several funcs that tested binary operations for various properties. The func is_closed that you were asked to write in Activity 5(a), for example, had two input parameters, a set G and a binary operation o. It was assumed that before you call this func you would have defined a set and a binary operation. The definition of the func would contain a single boolean expression. The value of this expression was returned as the result of a call to this func. Look back at the funcs you constructed in Activity 5, and write a description of the property that is tested by each of the funcs you were asked to construct in this activity.

Tuples and Smaps

In Activity 7(b) you compared two ISETL objects: the func next, which assigns to each number in [1..10] the next larger number, and the set of ordered pairs mnext := {[n, n+1] : n in [1..10]};. Hopefully you also discovered that expressions like mnext(6) = next(6) return true.

In ISETL, an ordered pair is a tuple with only two components defined. A set of pairs is called a map. The map gives us one more way to represent functions in ISETL, as in mathematics. As you can see in the above example, the way a map represents a function is that it assigns the second component of each ordered pair to the first component of the same pair. However, not every map represents a function. Since a function assigns to each element in its domain a single value, in order for a map to represent a function it must have the following additional property:

No value appears as a first component of more than one pair in the map.

A map with this property is called an smap (single-valued map). Any time an smap is constructed and assigned to a variable, that variable can be used as a function.
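The smap property can itself be written as a boolean expression. The following is only a sketch: the func name is_smap and the sample map m are ours, and the test simply restates the property that two pairs with the same first component must be the same pair.

```isetl
m := {[1, 2], [2, 3], [1, 5]};    $ 1 is a first component of two pairs

$ Hypothetical helper: true exactly when no first component is repeated.
is_smap := func(m);
    return forall p, q in m | (p(1) /= q(1)) or (p = q);
end func;

is_smap(m);                 $ false
is_smap(m less [1, 5]);     $ true
```

Removing the pair [1, 5] leaves {[1, 2], [2, 3]}, in which each first component appears only once.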

We have seen before (in Section 1.2, Activity 1) that tuples too can be used as functions. Use the examples of Activity 7(a) to discuss the differences between the way tuples operate like functions, and the way smaps do.

Procs

Activity 9 has an example of an ISETL proc or procedure. A procedure is the same as a func except that it has no return statement and does not return a value. It is used to perform some internal operations such as establishing the values of certain variables. In our example, it establishes the values of G, o, e, and the function inv. It is also used for external effects on the screen (such as printing or drawing something), or other devices (disk, printer, and so forth).
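A smaller proc of the same kind might look as follows. This is a sketch under assumptions: the name setup is ours, and we assume, as SetNot does, that variables assigned inside a proc remain available after the call.

```isetl
$ Hypothetical proc: establishes K, as and ms for arithmetic mod n.
setup := proc(n);
    K := {0..n-1};
    as := |x, y -> (x + y) mod n|;
    ms := |x, y -> (x * y) mod n|;
end;

setup(5);       $ returns nothing; it only assigns K, as and ms
K;              $ the set {0..4}
3 .as 4;        $ (3 + 4) mod 5 = 2
2 .ms 4;        $ (2 * 4) mod 5 = 3
```

The call setup(5); has no value of its own; its whole effect is on the variables K, as and ms.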

The Fields Zp

In the activities you dealt with the properties of structures consisting of the sets Zn and their binary operations mlt_n and add_n. The operation mlt_n was defined to be multiplication mod n, and add_n was addition mod n. We will denote such a structure by the 3-tuple [Zn, add_n, mlt_n]. Examples are [Z5, add_5, mlt_5], [Z6, add_6, mlt_6], [Z7, add_7, mlt_7] and others. In relation to these structures, you dealt with several new concepts:

• Identity was an element related to an operation within its respective Zn, which when operated with another element produced that other element as a result.

• Inverse of an element was an element related to an operation, its identity, and a specific element g in the corresponding Zn: g′ was said to be an inverse of g with respect to an operation o in its respective Zn if g .o g′ equals the identity (of the same operation).

• An element which had an inverse with respect to a specific operation in its appropriate Zn was called an invertible (in relation to that operation).

In addition to working with the ISETL code, you looked for the properties of the operations in the operation tables you constructed in Section 1.2, Exercise 7. How do you identify each of the concepts (Identity, Inverse and Invertibles) in these tables?

We hope that you became aware of some basic differences among [Z6, add_6, mlt_6], [Z5, add_5, mlt_5], and [Z7, add_7, mlt_7]. The main difference concerns the above concepts: While all three structures have an identity for the operations mlt_n, not every element in each of the structures has an inverse (in relation to mlt_n).

The element 0 is not invertible with respect to mlt_n in any of the structures.

While in Z5 and in Z7 all the other elements (g /= 0) have inverses, in Z6 not all do. Which are the elements of Z6 which do not have such an inverse (are not invertible)?

In the rest of the course we will learn about structures named vector spaces. These structures are constructed upon (and rely on) structures of numbers called fields. Fields have some important characteristics. Let us explain these characteristics.

• A field consists of a set K and two binary operations operating on it;

• Both operations are closed (see Activity 5);

• Each operation is commutative and associative (see Activity 5);

• There is a relation between the two operations called distributivity, which you probably know but with which we will not now deal;

• Each operation has an identity;

• For one of the operations every element has an inverse and the identity of this operation is usually denoted by 0;

• As for the second operation, all the elements of K − {0} have inverses. That means that every element has an inverse, except the identity of the first operation!

We will not keep working with fields. Such work belongs to another algebra course. For the time being, we need to know which Zn’s are fields, and hence can be used to build vector spaces.

From your work with Zn and their operations, you could see that Z6 is not a field. Although it has an identity for every operation, not every element of Z6 − {0} is invertible in relation to multiplication mod 6. Which elements of Z6 are not invertible?

We will not prove this here, but it can be proved that, as in our examples, when n is prime, Zn is a field. Examples are Z5 and Z7. We will use them to construct vector spaces. Can you think of more examples of fields?

But when n is not prime, as with n = 6 (because 6 = 2 · 3), Zn is not a field. The same is true of Z12 and Z4, with which you worked. We will not use them for the construction of vector spaces. Can you think of other examples of non-fields?
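The contrast between prime and non-prime n can be explored for other values of n with a short func. The sketch below is ours: the name units is hypothetical, and the func checks only the multiplicative-inverse condition, not the other field properties.

```isetl
$ Hypothetical func: the set of elements of Zn that have an inverse
$ with respect to multiplication mod n.
units := func(n);
    return {g : g in {0..n-1} | (exists h in {0..n-1} | (g * h) mod n = 1)};
end func;

units(11);              $ the set {1..10}: every nonzero element is invertible
(#units(11)) = 10;      $ true
units(9);               $ the set {1, 2, 4, 5, 7, 8}
(#units(9)) = 8;        $ false, since 3 and 6 have no inverse mod 9
```

Comparing #units(n) with n - 1 is one way to test the last condition in the list of field characteristics for a given Zn.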

Polynomials and Polynomial Functions

An important collection of functions which can be implemented in ISETL are the polynomial functions. We start with the definition of a polynomial.

Definition 1.3.1. A polynomial in x is an expression of the form

$$a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n.$$

The values $a_i$ are called the coefficients of the polynomial. The polynomial is said to be over the numbers K if all of the coefficients are in the set K. A polynomial for which $a_n \neq 0$ but $a_m = 0$ for all $m > n$ is said to have degree n.

Notice that a polynomial is not a function; it is only an expression using the variable x. Two polynomials are equal if and only if all of their coefficients match. In ISETL, a polynomial is implemented as a tuple of scalars in order of increasing degree. For example, the polynomial $2 + 3x^2$ is implemented as the tuple [2, 0, 3]. The collection of all polynomials of degree less than or equal to n over the numbers K is denoted by $P_n(K)$. The collection of all polynomials over K is denoted by $P(K)$.
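To see how the tuple representation connects to values, here is one possible way to evaluate a polynomial given as a tuple of coefficients. The func name eval_poly is ours, and the sketch assumes ordinary integer arithmetic, with ** for exponentiation and %+ for summing the components of a tuple.

```isetl
$ Hypothetical func: value at x of the polynomial with coefficient tuple c,
$ where c(i) is the coefficient of x**(i - 1).
eval_poly := func(c, x);
    return %+ [c(i) * x**(i - 1) : i in [1..#c]];
end func;

eval_poly([2, 0, 3], 2);    $ 2 + 0*2 + 3*2**2 = 14
eval_poly([2, 0, 3], 1);    $ 2 + 0 + 3 = 5
```

For a polynomial over Zp one would additionally reduce the result mod p.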

Each polynomial expression has an interpretation as a function. We now define a polynomial function.

Definition 1.3.2. A polynomial function is a function p where p(x) is a polynomial in x.

In ISETL, polynomial functions are represented as funcs. Over the real numbers, different polynomials lead to different polynomial functions. However, over the finite fields Zp polynomial functions can behave in rather unexpected ways. For example, over Z5, consider the polynomial functions p and q where $p(x) = x$ and $q(x) = x^5$. Compute p(a) and q(a) for every a ∈ Z5. You should discover that p = q despite the fact that the polynomials p(x) and q(x) are different. As a result, defining the degree of a polynomial function is more complicated than it seems.
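The comparison can be carried out in ISETL along the following lines. This is a sketch, with Z5 defined as {0..4} and the fifth power written out as a product; the expected values in the comments were computed by hand.

```isetl
Z5 := {0..4};

p := func(x); return x mod 5; end func;
q := func(x); return (x * x * x * x * x) mod 5; end func;

[[a, p(a), q(a)] : a in [0..4]];
$ expected: [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]

forall a in Z5 | p(a) = q(a);       $ true
```

For example, 2 to the fifth power is 32 and 32 mod 5 = 2, so q(2) = p(2).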

Definition 1.3.3. A polynomial function has degree n if it is the interpretation of a polynomial of degree n and not an interpretation of any polynomial of lesser degree.

The collection of all polynomial functions of degree less than or equal to n over the numbers K is denoted by $PF_n(K)$. The collection of all polynomial functions over K is denoted by $PF(K)$.

Theorem 1.3.1. Over the field Zp, the polynomial functions $x \mapsto x$ and $x \mapsto x^p$ are the same. As a result, $PF(Z_p) = PF_{p-1}(Z_p)$.
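The first statement of the theorem (known as Fermat's little theorem) can be spot-checked for a few small primes, and its consequence can be seen in a single case: over Z5, the function given by the seventh power agrees with the function given by the third power, because a factor of five copies of x can be replaced by x. The sketch below is ours and writes powers as products, using %* to multiply the components of a tuple.

```isetl
$ Does raising to the power p give back a, for every a in Zp?
forall p in {2, 3, 5, 7} |
    (forall a in {0..p-1} | (%* [a : i in [1..p]]) mod p = a);     $ true

$ The same test with the non-prime 6 in place of p fails (try a = 2).
forall a in {0..5} | (%* [a : i in [1..6]]) mod 6 = a;             $ false

$ Over Z5 the seventh power and the third power define the same function.
forall a in {0..4} | (%* [a : i in [1..7]]) mod 5 = (a * a * a) mod 5;   $ true
```

Repeating this kind of replacement lowers any exponent that is p or larger, which is the idea behind the second statement of the theorem.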


Exercises

1. Each of the following is a description of a function. Use an ISETL func to implement it, with a restricted domain where appropriate. Calculate the value of the function on at least three values of the domain in two ways: first with paper and pencil, using the given verbal description, and second, on the computer, using your func. Explain any discrepancies.

(a) The function takes a 4-tuple (a tuple with four elements) whose components are positive integers and computes the sum of the cubes of the components.

(b) The domain of the function is the set of all points of the square centered at the origin of a rectangular coordinate system with sides parallel to the coordinate axes and having length 2. The action of the function is to rotate the square counterclockwise through an angle of 180°.

(c) The function determines the number of non-OM components of a tuple. (Note that not all of the components of a tuple between the first and last must be defined.)

2. Write ISETL funcs with the given names according to each of the following specifications. In each case, set up specific values for the parameters and run your code to check that it works.

(a) The func is_associative has two input parameters: a set G and a binary operation o. The action of is_associative is to determine whether the operation represented by o is associative. This is indicated by returning the value true or false.

(b) Construct a func add_7 that implements addition mod 7 in Z7. Use your construction in ISETL code that shows that every element in Z7 has an inverse in relation to add_7.

(c) Repeat part (b) with Z6 and add_6.

3. Write an ISETL map that implements the function whose domain is Z20 and assigns to each element x an element y such that (x + y) mod 20 = 0. Is this an smap? Explain.


4. Do the same as the previous exercise with addition replaced by multiplication and 0 replaced by 1.

5. Write an ISETL smap that implements the operation of addition mod 20 in Z20.

6. Write an ISETL func that accepts a pair consisting of a set G and a binary operation o on G. The action of the func is to convert this pair into an smap which implements the operation. Use your func to do the previous exercise.

7. Construct a function that takes a tuple (of any length) for input and returns the set of all of the components of the tuple for output.

8. Look again at the func Av of Activity 1(c). Is it or is it not a binary operation? Write an explanation. If it is, use it with several appropriate inputs as an operation written between the two parameters. If it is not, what modifications need to be made to Av in order to produce a similar function (say, AvBin) which could be used as a binary operation, in particular in the method described above?

9. Write a tuple and an smap of your own. Operate both as functions. For each of them, write an explanation: What are the inputs of the function? What is its domain? (The domain is the set of elements you can “input” in the function.) What are its outputs?

Chapter 2

Vectors and Vector Spaces

You have seen vectors in your physics classes, in multivariable calculus and perhaps in other courses. In those cases, vectors were probably considered to be “things with direction and magnitude” and were usually represented as directed line segments. In this chapter and beyond, we will be working with vectors in an abstract sense. Certainly the vectors with which you are already familiar will be included in our work (although they will all have their “tails” at the origin). However, as we work with vectors and vector spaces, you will find that polynomials and infinitely differentiable functions are also vectors.


2.1 Vectors

Activities

1. (a) Define the set K = Z5 = {0, 1, 2, 3, 4} in ISETL.

(b) Write an ISETL func add_scal that accepts two elements of K and returns their sum mod 5.

(c) Write an ISETL func mult_scal that accepts two elements of K and returns their product mod 5.

2. (a) Define V = (Z5)^2, that is, the set of all 2-tuples with components from Z5, in ISETL. How many elements are there in V?

(b) Write an ISETL func vec_add that accepts two elements, [v1, v2] and [w1, w2], of V and returns the tuple [(v1 + w1) mod 5, (v2 + w2) mod 5].

(c) Write an ISETL func scal_mult that accepts an element k of Z5 and a tuple [v1, v2] from V, and returns the tuple [(k v1) mod 5, (k v2) mod 5].

3. Define the tuples v = [2, 3], w = [1, 1], and u = [0, 3] in ISETL. Use your funcs defined in Activities 1 and 2 to determine whether the following tuples are the same.

(a) v + w and w + v.

(b) (u+ v) + w and u+ (v + w).

(c) v + v and 2v.

(d) (−1)v and −v.

(e) v + (−1)v and v − v.

(f) 2(3u) and (2 ∗ 3)u.

(g) 2(v + w) and 2v + 2w.

4. How is the following code different from the func vec_add you wrote in Activity 2? What assumption does this code make about v and w?

va := |v, w -> [(v(i) + w(i)) mod 5 : i in [1..#v]]|;

Use va to add the following tuples in (Z5)^n. Can you add these tuples using vec_add?

(a) [2, 2, 1] + [3, 0, 4]

(b) [0, 1, 0, 1] + [1, 2, 3, 4]

(c) [1, 2] + [2, 1]

5. (a) Write an ISETL func sm that accepts an element k from Z5 and a tuple v, and returns the tuple kv in which each component of v has been multiplied (mod 5) by k.

(b) Test your func for k = 3 and v = [2, 4].

(c) Test your func for k = 0 and v = [1, 3, 3].

(d) Test your func for k = 1 and v = [3, 2, 4, 1].

6. (a) Write an ISETL func is_closed_va that accepts a set V of tuples and an operation va (vector addition). Your func should test whether the sum of any two tuples in V is again in V.

(b) Test your func on V = (Z5)^2, with va defined in Activity 4.

(c) Test your func on V = (Z3)^3. Modify va appropriately, using mod 3 arithmetic.

(d) Test your func on V = (Z2)^4. Modify va appropriately, using mod 2 arithmetic.

7. (a) Write an ISETL func is_commutative that accepts a set V of vectors (tuples) and an operation va and determines whether or not the operation va is commutative on V.

(b) Test your func on V = (Z5)^2 and va.

(c) Test your func on V = (Z3)^3 and an appropriately modified va.

(d) Test your func on V = (Z2)^4 and an appropriately modified va.

8. (a) Write an ISETL func is_associative_va that accepts a set V of vectors (tuples) and an operation va, and determines whether or not va is associative on V.




9. Explain the following ISETL code. What are the inputs to this func? What does this func return?

has_zerovec := func(V, va);

VZERO := choose z in V | forall v in V | (v .va z) = v;

return VZERO;

end;

10. (a) Use the func has_zerovec to write a new func has_vinverses that accepts a set V of tuples and an operation va and determines whether or not for each x in V there is a y in V with the property that va(x, y) = the result of has_zerovec(V, va).



(d) Test your func on V = (Z2)^4 and an appropriately modified va.

11. Explain the following ISETL code:

is_closed_sm := func(K, V, sm);

return forall k in K, v in V | (k .sm v) in V;

end;

12. Write an ISETL func is_associative_sm that accepts a set K of scalars, a set V of vectors, and two operations, sm (scalar multiplication) and ms (multiplication of scalars). Your func should determine whether for all k, j in K and all v in V, k(jv) = (kj)v. Note that the right hand side of this equation uses “multiplication of scalars” as well as scalar multiplication. What is the difference? Test your func on (Z2)^2.

13. What does the following ISETL func do? What are the inputs? The outputs?

has_distributive1 := func(K, V, sm, va);

return forall k in K, v, w in V |

(k .sm (v .va w)) = (k .sm v) .va (k .sm w);

end;

14. Write an ISETL func has_distributive2 that accepts a set K of scalars, a set V of tuples, and three operations: va (vector addition), sm (scalar multiplication), and as (addition of scalars). The action of your func is to determine whether the following expression holds for all k, j in K and v in V: (k + j)v = kv + jv.

15. Write an ISETL func has_identityscalar that accepts a set K of scalars, a set V of vectors (tuples), and an operation sm, scalar multiplication. The action of your function is to determine whether there is an element k in K such that for all v in V, sm(k, v) = v.

Discussion

In these activities you created tuples with components chosen from Z2, Z3, or Z5, and wrote code to perform operations on those tuples. Such tuples are more commonly known as vectors. A vector can be any tuple with components in a set K of scalars. In ISETL we denote vectors by v = [v1, v2, · · · , vn]. In mathematical notation we write v = 〈v1, v2, · · · , vn〉. The numbers vi are known as the “components” of the vector.

Any specific vector might be thought of as living in several different “spaces”. For example, v = 〈2, 2〉 could be an element of the space (Z3)^2, or of (Z5)^2, or of R^2. (Why can it not be an element of (Z2)^2?) If we choose to work within a specific space (K)^n, then we can combine vectors with each other using an operation of “vector addition”. The addition is done component-wise. For example, if we are working with 2-tuples with entries from Z5, the sum of v = 〈v1, v2〉 and w = 〈w1, w2〉 is 〈(v1 + w1) mod 5, (v2 + w2) mod 5〉. We also have an operation of “scalar multiplication” which allows us to combine scalars with vectors. This multiplication is also done component-wise. There is a natural relationship between this vector addition and scalar multiplication that is very satisfying. For example v + v results in the same vector as 2v. Linear algebra is built on these two operations of adding vectors and multiplying by scalars.
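The relationship between v + v and 2v can be checked for a particular vector over Z5. In this sketch va is the func from Activity 4, while sm is one possible version of the scalar multiplication asked for in Activity 5 (our assumption, not the only way to write it).

```isetl
va := |v, w -> [(v(i) + w(i)) mod 5 : i in [1..#v]]|;
sm := |k, v -> [(k * v(i)) mod 5 : i in [1..#v]]|;

v := [2, 3];
v .va v;                    $ [4, 1], since (3 + 3) mod 5 = 1
2 .sm v;                    $ [4, 1], since (2 * 3) mod 5 = 1
(v .va v) = (2 .sm v);      $ true
```

A single vector is only one instance; checking the relationship for every vector in a finite space takes a forall over the whole set.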

In these Activities you worked exclusively with finite sets of scalars and vectors. This is because much of our work in ISETL requires us to be able to define finite sets. However, many of the ‘real-world’ applications of vectors that you will see in this and other courses deal with infinite sets of scalars and vectors. For example R^2 is the set of all ordered pairs of real numbers. The vectors in R^2 have both a physical (as forces or velocities) and a geometric interpretation. In R^2 vectors can be thought of as quantities that have both a direction and magnitude. We can represent the vector v = 〈4, 2〉 by an arrow in a two-dimensional plane. The arrow will start at the origin (0, 0) and end at the point (4, 2). (See Figure 2.1.) Such a vector has a magnitude and direction, and shows both of its components simultaneously. The vector w = 〈2, 1〉 has the same direction as v but is half as long. Can you see how to use the Pythagorean theorem to find the length of such vectors? What is the relationship between the length of a vector v and the length of 2v? Of v and kv?

Figure 2.1: A vector in R^2

Of course not all of the arrows in 2-space originate at (0, 0). We will consider two vectors v1 and v2 to be equivalent if they have the same length and direction, even if they originate at different points. (See Figure 2.2.) Such vectors are obviously parallel arrows.

Figure 2.2: Parallel vectors in R^2

One reason for allowing vectors to start at different points is to be able to visualize the sum of two vectors. In Figure 2.3 we can form the vector v + w by translating w so that the start of w is placed at the end of v. Then v + w is the arrow drawn from the start of v to the end of the translated vector w. Note that this geometric vector addition produces a parallelogram. To get v + w we can travel along v and then along w, or we can take the shortcut along the diagonal v + w of the parallelogram.

Figure 2.3: Adding vectors in R^2, showing v = 〈4, 2〉, w = 〈−1, 2〉, their translates v′ and w′, and the sum v + w

Use algebra to check that the geometric addition in Figure 2.3 is correct. In other words, is v + w = 〈4, 2〉 + 〈−1, 2〉 equal to the vector 〈3, 4〉?

Given a vector v, what would we mean by the vector −v? How would you draw −v in the real plane? What is the relationship between the length and direction of v and that of −v? How can we combine vector addition and multiplication by −1 to obtain vector subtraction?

Ordered triples of real numbers can also be thought of as vectors and visualized geometrically. In order to do this we need an xyz coordinate system. The set of all such ordered triples is known as 3-space or R^3. Vectors that live in spaces with more than 3 components are not so easily visualized. However, many of the results and techniques of vector arithmetic are useful in such situations where there is no direct geometric significance. This leads us to the following definition.

Definition 2.1.1. The set of all sequences 〈v1, v2, · · · , vn〉 of real numbers is called Real n-space and is denoted R^n.

In Activities 5–15, you wrote or explained several ISETL funcs that checked various properties of vector addition and scalar multiplication. Systems in which these particular properties are satisfied turn out to be very useful in the study of linear algebra. We will explore such systems further in the next section.

Exercises

1. Compute the following vector expressions for

v = 〈2, 3〉 ,u = 〈−3, 1〉 , and w = 〈8, 0〉

(a) (1/2)w

(b) v + u

(c) v + u + w

(d) 2v + 3u + w

2. (a) Draw the vectors v = 〈4, 1〉 and (1/2)v in a single xy plane.

(b) Draw the vectors v = 〈4, 1〉 and w = 〈−2, 2〉 and v + w and v − w in a single xy plane.

3. Compute the following vector expressions for

v = 〈1, 2, 3〉 ,u = 〈−3, 1,−2〉 , and w = 〈2,−3,−1〉 .

(a) v + w

(b) v + 3u

(c) w + u

(d) 5v − 2u + 6w

(e) 2v − 3u− 4w

4. To what number do the components of every scalar multiple of v = 〈2, 1, −3〉 add up?

5. Use the Pythagorean theorem to find the length of the following vectors in R^2.

(a) 〈4, 3〉

(b) 〈2, 0〉

(c) 〈−1, −2〉

(d) 3 〈−1, −2〉

(e) 〈0, 0〉

6. Extend the notion of the length of a vector to R^n by

$$\mathrm{length}(v) = \sqrt{(v_1)^2 + (v_2)^2 + \cdots + (v_n)^2}.$$

Find the length of the following vectors.

(a) 〈2, 4, 3〉

(b) −2 〈2, 4, 3〉

(c) 〈2, 0, 0〉

(d) 〈−1, 1, 0, −2〉

(e) 〈5, 5, 5, 5〉

7. A unit vector is a vector of length one. Find 3 distinct unit vectors in R^2. Find 4 distinct unit vectors in R^3.

8. Is the sum of any two unit vectors a unit vector? Give a proof or counterexample.

9. Let v = 〈5, −3, 4〉. Find a scalar k in R such that kv is a unit vector.

10. If three corners of a parallelogram are (1, 1), (4, 2) and (1, 3), what are all the possible fourth corners? Draw two of them.

11. Let v = 〈1,−2, 1〉, w = 〈0, 1,−1〉. Find scalars k and j so that

kv + jw = 〈4, 2,−6〉 .


2.2 Introduction to Vector Spaces

Activities

1. Following is a list of some funcs that you worked with in the previous section.

is_closed_va

is_commutative

is_associative_va

has_zerovec

has_vinverses

is_closed_sm

is_associative_sm

has_distributive1

has_distributive2

has_identityscalar

Write a description of what each func does, including the kind of objects accepted, what is done to them, and the kind of object that is returned.

2. (a) Construct in ISETL a set K = Z3 of scalars, a set V = (Z3)^3 of vectors, and four operations: va (vector addition), which is addition mod 3 of elements in V; sm (scalar multiplication), which is multiplication mod 3 of elements in V by elements in K; as (addition of scalars), which is addition mod 3; and ms (multiplication of scalars), which is multiplication mod 3.

For example, you could write and store code such as:

K := {0..2}; V := {[x, y, z] : x, y, z in K};

va := |v, u -> [(v(i) + u(i)) mod 3 : i in [1..3]]|;

sm := |k, v -> [(k * v(i)) mod 3 : i in [1..3]]|;

as := |k, j -> (k + j) mod 3|; ms := |k, j -> (k * j) mod 3|;

(b) Apply each of your funcs from Activity 1 to this system, [K,

V, va, sm, as, ms]. Create a table with the funcs as columnheadings and this system as the first row, and use the table tokeep track of which properties are satisfied this system.

2.2 Introduction to Vector Spaces 61

3. Repeat Activity 2 for each of the following systems. Add a new row to your table for each system.

(a) K = Z_5, V = (Z_5)^2, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.

(b) K = Z_3, V = {〈x, x, x〉 : x ∈ K}, va is addition mod 3 of elements in V, sm is multiplication mod 3 of elements in V by elements in K, as and ms are addition and multiplication mod 3 respectively.

(c) K = Z_5, V = {〈x, y〉 : x, y ∈ {1, 3}}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.

(d) K = Z_5, V = {〈x, 0, 0〉 : x ∈ K}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.

(e) K = Z_5, V = {〈x, 1〉 : x ∈ K}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.

(f) K = Z_2, V = (Z_2)^5, va is addition mod 2 of elements in V, sm is multiplication mod 2 of elements in V by elements in K, as and ms are addition and multiplication mod 2 respectively.

(g) K = {0}, V = (Z_3)^2, va is addition mod 3 of elements in V, sm is multiplication mod 3 of elements in V by elements in K, as and ms are ordinary addition and multiplication respectively.

(h) K = Z_7, V = (Z_7)^1, va and as are addition mod 7, sm and ms are multiplication mod 7.

(i) K = Z_5, V = {〈x, y〉 : x, y ∈ {0, 2, 4}}, va is addition mod 5 of elements in V, sm is multiplication mod 5 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.

(j) K = Z_5, V = (Z_2)^3, va is addition mod 5 of elements in V, sm is multiplication mod 2 of elements in V by elements in K, as and ms are addition and multiplication mod 5 respectively.


(k) K = Z_3, V = (Z_3)^3, va is addition mod 3 of elements in V, sm is defined by k〈x, y, z〉 = 〈0, 0, 0〉, as and ms are addition and multiplication mod 3 respectively.

(l) K = Z_5, V = (Z_5)^2, va is addition mod 5 of elements in V, sm is defined by k〈x, y〉 = 〈x, y〉, as and ms are addition and multiplication mod 5 respectively.

4. Which systems from Activities 2 and 3 satisfy all ten properties from Activity 1? Can you conjecture conditions on K = Z_p, (Z_q)^n, and the four operations so that such a system will satisfy all ten properties?

5. Here is a list of some more systems [K, V, va, sm, as, ms]. Which of these systems satisfy all of the properties in Activity 1? Note: Most of the following systems can only be constructed and run in VISETL (Virtual ISETL). This means that all of your work must be done by hand and in your mind.

(a) K = {−1, 1}, V = (K)^3, va is ordinary component-wise multiplication, sm is ordinary component-wise multiplication, as, ms are ordinary addition and multiplication respectively.

(b) K = R, V = R^2, va is ordinary component-wise addition, and sm is ordinary component-wise multiplication, as, ms are ordinary addition and multiplication respectively.

(c) K = R, V = R^2, va is ordinary component-wise addition, and sm is defined by k〈x, y〉 = 〈kx, 3ky〉, as, ms are ordinary addition and multiplication respectively.

(d) K = R, V = R − {0}, va is defined by 〈x〉 + 〈y〉 = 〈xy〉, sm is defined by k〈x〉 = 〈x^k〉, as, ms are ordinary addition and multiplication respectively.

(e) K = R, V = {〈x, 0, x〉 : x ∈ R}, with va, sm, as, and ms defined as usual for R^3.

6. Write a func is_vector_space that accepts a set K of scalars, a set V of vectors, and four operations va, sm, as, and ms defined on V and K, and tests whether all of the properties listed in Activity 1 are satisfied. Your func should return true if the system satisfies all ten properties, and false if it fails to satisfy one or more properties. Test your func on some of the systems defined in Activity 2.


Discussion

In the activities at the beginning of this section you constructed several mathematical systems and examined their properties. More specifically, you constructed certain sets of vectors and sets of scalars, defined operations on them and studied various properties of these sets under the defined operations.

The ten properties listed in Activity 1 are satisfied by many important mathematical systems. Rather than study each system separately, we are going to collectively consider all systems that satisfy these ten properties. We begin with the definition of such systems.

Definition 2.2.1. A set V of objects called vectors, together with the binary operations of vector addition and scalar multiplication is said to be a vector space over a field of scalars K if for all u, v, and w in V and all k, j in K the following axioms are satisfied:

Axiom 1: u + v ∈ V (closure under vector addition).

Axiom 2: u + v = v + u (commutativity of vector addition).

Axiom 3: (u + v) + w = u + (v + w) (associativity).

Axiom 4: There is a vector 0 ∈ V such that v + 0 = v (zero vector).

Axiom 5: For each v ∈ V there is a unique element (−v) ∈ V such that v + (−v) = 0 (vector inverses).

Axiom 6: kv ∈ V (closure under scalar multiplication).

Axiom 7: (kj)v = k(jv) (associativity of scalar multiplication).

Axiom 8: k(u + v) = ku + kv (first distributive law).

Axiom 9: (k + j)v = kv + jv (second distributive law).

Axiom 10: There is an element 1 ∈ K such that for every v in V, 1v = v (identity scalar).

We digress for a moment to discuss this “field of scalars” mentioned in the definition. Scalars are just numbers, but what is a field? A field is a set of objects (usually numbers), together with two operations (addition and multiplication) defined on the set that collectively satisfy many properties that you have seen in your previous work with the real number system. That is, a field has all the standard properties of the real numbers, including closure under both operations, additive and multiplicative identities and inverses, and properties such as commutativity, associativity and the distributive laws. There are both finite and infinite fields. R is obviously an infinite field, as are the rational numbers Q, and the complex numbers, C. However, Z is not a field. Why not? Is {0} a finite field? It turns out that if p is a prime number, then Z_p forms a finite field under the operations of addition and multiplication mod p. The system Z_m is not a field when m is not prime, because (among other reasons) not all elements in Z_m have multiplicative inverses.
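The remark about multiplicative inverses can be explored by brute force. The following is a minimal sketch in Python rather than ISETL; the names has_all_inverses and elements_without_inverse are our own and are used only for illustration. For a given m ≥ 2 it searches Z_m for nonzero elements that have no multiplicative inverse mod m.

```python
def elements_without_inverse(m):
    """List the nonzero a in Z_m for which no b satisfies a*b = 1 (mod m)."""
    return [a for a in range(1, m)
            if all((a * b) % m != 1 for b in range(1, m))]

def has_all_inverses(m):
    """True when every nonzero element of Z_m has a multiplicative inverse."""
    return elements_without_inverse(m) == []

# Compare a few prime and non-prime moduli.
for m in [2, 3, 4, 5, 6, 7]:
    print(m, has_all_inverses(m), elements_without_inverse(m))
```

Having inverses is only one of the field properties, so this sketch tests a necessary condition and nothing more.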

We will not in general worry about the specific details of a field in this course. Henceforth we will generally restrict our scalars to one of the fields Q, R, C, or Z_p. In each case, the operations (as, ms) of addition and multiplication of scalars are henceforth understood to be ordinary addition and multiplication or addition and multiplication mod p, so we no longer need to specify them.

Finite Vector Spaces

We now generalize from some of the systems you worked with in the activities to find examples of vector spaces. In cases where we do find a vector space, we will prove that fact. Where we do not, we will investigate the vector space axioms that are violated. The examples we will consider fall naturally into two types: finite and infinite vector spaces.

Your work in the Activities should have convinced you that finite systems such as K = Z_3, V = (Z_3)^3, component-wise addition mod 3, and component-wise multiplication mod 3, or K = Z_5, V = (Z_5)^2, with the corresponding operations defined mod 5 do satisfy all ten axioms of a vector space. In fact, you may have conjectured the following theorem.

Theorem 2.2.1. For any positive integer n, and any prime p, (Z_p)^n forms a vector space over Z_p.

Note that in our theorem we have not mentioned the operations va, sm, as, or ms. Why not? Your work in the Activities should have convinced you that there is a natural choice for these operations. In order for the system to form a vector space, the operations will be done mod p. The theorem does require that p be prime; (Z_m)^n is only a vector space when Z_m is a field.

Proof. For a particular p and n we could always use ISETL to check all ten axioms. However, the theorem holds for all primes p, so we will not specify a particular one. We prove the theorem for the case n = 2 and leave the generalization to any n for Exercise 16. Our proof consists of running through the axioms for a vector space, and citing appropriate properties of addition and multiplication mod p which you learned about in Chapter 1.

closure: v + w = 〈(v_1 + w_1) mod p, (v_2 + w_2) mod p〉 ∈ (Z_p)^2 since the remainder of v_i + w_i on division by p is always between 0 and p − 1.

commutativity:

v + w = 〈(v_1 + w_1) mod p, (v_2 + w_2) mod p〉
      = 〈(w_1 + v_1) mod p, (w_2 + v_2) mod p〉
      = w + v.

associativity: Since mod p addition is associative, the component-wise mod p addition is also associative.

zero vector: The vector 0 = 〈0, 0〉 ∈ (Z_p)^2 and

v + 0 = 〈(v_1 + 0) mod p, (v_2 + 0) mod p〉 = 〈v_1, v_2〉 = v.

vector inverses: The inverse of v = 〈v_1, v_2〉 is 〈p − v_1, p − v_2〉, with components taken mod p. Why?

closure: kv = 〈(kv_1) mod p, (kv_2) mod p〉 ∈ (Z_p)^2.

associativity: The associativity of multiplication mod p is inherited from the integers. Component-wise multiplication mod p is therefore associative.

distributive law 1:

k(v + w) = k〈(v_1 + w_1) mod p, (v_2 + w_2) mod p〉
         = 〈k(v_1 + w_1) mod p, k(v_2 + w_2) mod p〉
         = 〈(kv_1 + kw_1) mod p, (kv_2 + kw_2) mod p〉
         = 〈(kv_1) mod p, (kv_2) mod p〉 + 〈(kw_1) mod p, (kw_2) mod p〉
         = kv + kw.

distributive law 2:

(k + j)v = 〈((k + j)v_1) mod p, ((k + j)v_2) mod p〉
         = 〈(kv_1 + jv_1) mod p, (kv_2 + jv_2) mod p〉
         = 〈(kv_1) mod p, (kv_2) mod p〉 + 〈(jv_1) mod p, (jv_2) mod p〉
         = kv + jv.

identity scalar: Clearly 1 ∈ Z_p and 1v = v.
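For one particular p and n, the ten axioms can also be checked exhaustively by machine, as the start of the proof suggests. The sketch below does this in Python rather than ISETL. The helper names make_system, check_axioms and failed are our own, and the dictionary keys simply echo the func names from Activity 1. A second system with a deliberately altered scalar multiplication, k·v = 〈(k^2 v_1) mod p, (k^2 v_2) mod p〉, is included as a hypothetical example of a failure.

```python
from itertools import product

def make_system(p, n):
    """Build [K, V, va, sm, as, ms] for (Z_p)^n with all operations mod p."""
    K = list(range(p))
    V = list(product(K, repeat=n))
    va = lambda u, v: tuple((a + b) % p for a, b in zip(u, v))
    sm = lambda k, v: tuple((k * a) % p for a in v)
    add_s = lambda k, j: (k + j) % p
    mul_s = lambda k, j: (k * j) % p
    return K, V, va, sm, add_s, mul_s

def check_axioms(K, V, va, sm, add_s, mul_s):
    """Test each of the ten axioms by exhaustive search over a finite system."""
    Vset = set(V)
    zeros = [z for z in V if all(va(v, z) == v for v in V)]
    return {
        "is_closed_va": all(va(u, v) in Vset for u in V for v in V),
        "is_commutative": all(va(u, v) == va(v, u) for u in V for v in V),
        "is_associative_va": all(va(va(u, v), w) == va(u, va(v, w))
                                 for u in V for v in V for w in V),
        "has_zerovec": bool(zeros),
        "has_vinverses": bool(zeros) and all(
            any(va(v, w) == zeros[0] for w in V) for v in V),
        "is_closed_sm": all(sm(k, v) in Vset for k in K for v in V),
        "is_associative_sm": all(sm(mul_s(k, j), v) == sm(k, sm(j, v))
                                 for k in K for j in K for v in V),
        "has_distributive1": all(sm(k, va(u, v)) == va(sm(k, u), sm(k, v))
                                 for k in K for u in V for v in V),
        "has_distributive2": all(sm(add_s(k, j), v) == va(sm(k, v), sm(j, v))
                                 for k in K for j in K for v in V),
        "has_identityscalar": any(all(sm(e, v) == v for v in V) for e in K),
    }

def failed(results):
    """Names of the axioms that did not hold."""
    return [name for name, ok in results.items() if not ok]

K, V, va, sm, add_s, mul_s = make_system(3, 2)
print(failed(check_axioms(K, V, va, sm, add_s, mul_s)))      # []

# Hypothetical broken scalar multiplication: multiply by k*k instead of k.
sm_sq = lambda k, v: tuple((k * k * a) % 3 for a in v)
print(failed(check_axioms(K, V, va, sm_sq, add_s, mul_s)))   # ['has_distributive2']
```

A check of this kind settles one choice of p and n at a time, which is why the theorem still needs the general argument given above.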

In the Activities you discovered some finite vector spaces that were not of the form (Z_p)^n. For example, the system in Activity 3(b) where K = Z_3, V = {〈x, x, x〉 : x ∈ K} forms a vector space. Note that all the vectors in this space have identical components, and all these vectors are also in the vector space (Z_3)^3. So, to determine whether or not this V is a vector space, we do not need to check all ten axioms. Axioms 2, 3, 7, 8, 9, and 10 are automatically true for this subset V since they are true for all vectors in (Z_3)^3, and scalars in K. Closure axioms 1 and 6 are fairly easily checked since adding two vectors with identical components must result in a vector of identical components, and multiplying 〈x, x, x〉 by any element of Z_3 will result in a vector with identical components. It is clear that 〈0, 0, 0〉 is the zero vector in V and that the vector inverse 〈3 − x, 3 − x, 3 − x〉 of 〈x, x, x〉 (with components taken mod 3) is also in V, and thus axioms 4 and 5 are also satisfied. What other subsets of (Z_p)^n did you find to be vector spaces? Can you think of additional examples that were not explored in the Activities?

Some of the finite systems you worked with in the Activities were not vector spaces. For example, the system in Activity 3(l) where K = Z_5, V = (Z_5)^2, and k〈x, y〉 = 〈x, y〉 is not a vector space. To determine this, we do not need to check the first 5 axioms because they only involve vector addition, and our theorem guarantees that this system satisfies the vector addition axioms. We do need to check axioms 6 through 10 since they all involve some form of scalar multiplication. Is this system closed under scalar multiplication? Does the theorem guarantee that? What is the identity scalar? Is there more than one in this case? Which of the distributive laws does not hold?

In the activities you also learned that the system K = {0}, V = (Z_3)^2 is not a vector space. Why not? Which axiom does it fail to satisfy? If we change K to Z_3 would the system be a vector space? What if we change K to Z_2?

Why is the system K = Z_5, V = {〈x, y〉 : x, y ∈ {1, 3}} not a vector space? How many axioms are failed? Would changing K correct the problems? Does the system K = Z_5, V = {〈x, y〉 : x, y ∈ {0, 2, 4}} fail the same axioms or different ones? Can you “fix” K so that these systems will be vector spaces?

Infinite Vector Spaces

Now we turn our attention to infinite vector spaces. Consider V = R^2, with vector addition and scalar multiplication defined by the ordinary component-wise operations. Is V a vector space over the real numbers? Is R^3 a vector space? R^n? We answer these questions with a theorem.

Theorem 2.2.2. Let n be a positive integer. The space R^n of ordered n-tuples with components from R is a vector space over R.

Proof. We cannot use ISETL to prove this theorem (why not?), but we note that many of the vector space axioms are true as a consequence of properties of the real numbers. We only need to check the component-wise application of these properties. We now prove a few of the axioms for n = 2 and leave the rest to Exercise 17.

closure: v + u = 〈v_1, v_2〉 + 〈u_1, u_2〉 = 〈v_1 + u_1, v_2 + u_2〉 ∈ R^2.

commutativity: Exercise 17.

associativity:

(v + u) + w = 〈v_1 + u_1, v_2 + u_2〉 + 〈w_1, w_2〉
            = 〈v_1 + u_1 + w_1, v_2 + u_2 + w_2〉
            = 〈v_1, v_2〉 + 〈u_1 + w_1, u_2 + w_2〉
            = v + (u + w).

zero vector: Exercise 17.

inverses: If v ∈ R^2 then −v = 〈−v_1, −v_2〉 ∈ R^2, and v + (−v) = 〈0, 0〉.

closure: Exercise 17.

associativity: Exercise 17.

distributive law 1:

k(v + u) = k(〈v_1, v_2〉 + 〈u_1, u_2〉)
         = k〈v_1 + u_1, v_2 + u_2〉
         = 〈kv_1 + ku_1, kv_2 + ku_2〉
         = 〈kv_1, kv_2〉 + 〈ku_1, ku_2〉
         = kv + ku.

distributive law 2: Exercise 17.

identity scalar: 1 ∈ R, and 1v = 1〈v_1, v_2〉 = 〈1v_1, 1v_2〉 = 〈v_1, v_2〉 = v.

Since R^n is a vector space, it seems reasonable to believe that C will also be a vector space over R. In this case V = {〈a + bi〉 : a, b ∈ R}. Scalar multiplication is defined by k〈a + bi〉 = 〈ka + kbi〉, and vector addition by 〈a + bi〉 + 〈c + di〉 = 〈(a + c) + (b + d)i〉. You will verify the vector space axioms in Exercise 5.
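The two operations just defined on C can be tried out numerically. The following Python sketch (not ISETL) uses Python's built-in complex type; the names va and sm follow the section's notation, and the sample values z, w, k and j are arbitrary choices of ours. It checks a few of the axioms for those sample values only, so it is a sanity check and not a verification of the axioms.

```python
def va(z, w):
    """Vector addition: (a + bi) + (c + di) = (a + c) + (b + d)i."""
    return z + w

def sm(k, z):
    """Scalar multiplication by a real k: k(a + bi) = ka + kbi."""
    return complex(k * z.real, k * z.imag)

# Arbitrary sample vectors and real scalars (integer-valued, so the
# floating-point arithmetic below is exact).
z, w = complex(2, -3), complex(1, 4)
k, j = 3.0, -2.0

checks = {
    "distributive law 1": sm(k, va(z, w)) == va(sm(k, z), sm(k, w)),
    "distributive law 2": sm(k + j, z) == va(sm(k, z), sm(j, z)),
    "associativity of sm": sm(k * j, z) == sm(k, sm(j, z)),
    "identity scalar": sm(1.0, z) == z,
}
for name, ok in checks.items():
    print(name, ok)
```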

Is C^n a vector space over R? Is it a vector space over C? Is Q^n a vector space over Q? Why is Q^n not a vector space over R? Which closure axiom fails? In Activity 5, did you find infinite vector spaces that were not of the form (K)^n for some field K? Is K = R, V = {〈x, 0, x〉 : x ∈ R} a vector space? How would you verify closure under vector addition and scalar multiplication? Knowing that the operations in this space are the same as those for R^3, do you need to check commutativity, associativity, or the distributive laws? Does V contain a zero vector? What is the vector inverse of 〈x, 0, x〉 in V?

Does the system in Activity 5(d) satisfy the commutativity, associativity and distributive axioms? Since va and sm are not the usual operations on R, we need to check. What is the “zero” vector? What is the inverse of the vector 〈x〉? Is this system a vector space?

In Activity 5(a) you should have discovered that the system K = {−1, 1}, V = (K)^3, va = ordinary component-wise multiplication, is not a vector space. Why not? Which axiom does it fail to satisfy? Is the system in Activity 5(c) closed under scalar multiplication? Is this system a vector space?


The vectors in a vector space do not necessarily have to be tuples of numbers. Polynomials and functions defined on a set S can also play the role of vectors. Vector spaces turn up in a wide variety of subjects. For example, vector spaces arise naturally in the study of solutions of systems of equations, geometry in 3-space, solutions of differential and integral equations, discrete and continuous Fourier transforms, quantum mechanics, and approximation theory.

Note that we are sometimes sloppy and write things such as “Let V = (Z_5)^2 be a vector space” with no specific mention of the corresponding field or operations. Technically this is incorrect. Why? In order to be a vector space, we have to specify not only the set V of vectors, but also the set K of scalars and the operations of vector addition and scalar multiplication. In many cases the scalars and operations are unambiguous, and so we just describe the set V of vectors. Henceforth, when K is not specified, you may assume it is Z_p if V is finite, or R if V is infinite. The operations va and sm are the standard operations on (Z_p)^n or R^n unless otherwise specified.

Non-Tuple Vector Spaces

There are two non-tuple vector spaces which we will discuss throughout this text. We present them here by beginning with the following theorem.

Theorem 2.2.3. The set P(K) is a vector space over K with the standard polynomial arithmetic. For any n, the set P_n(K) is a vector space over K with the standard polynomial arithmetic.

Proof. Left as an exercise (see Exercise 11).

This result is not very surprising because polynomials are really just tuples of numbers. Recall the definitions of pointwise operations on functions. If f and g are functions with the same domain and range and addition and multiplication are defined on the range of f and g, then we can define f + g to be the function x −→ f(x) + g(x) and kf to be the function x −→ kf(x). Not only do the polynomials form a vector space, but they do so when interpreted as functions as well.
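The two views of a polynomial, as a tuple of coefficients and as a function, can be placed side by side. The sketch below is in Python rather than ISETL. All of the names (poly_add, poly_scale, as_function, fn_add, fn_scale) are our own, and a polynomial a_0 + a_1 x + a_2 x^2 is represented, as an assumption of this sketch, by the tuple (a_0, a_1, a_2).

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient tuples of equal length."""
    return tuple(a + b for a, b in zip(p, q))

def poly_scale(k, p):
    """Multiply every coefficient by the scalar k."""
    return tuple(k * a for a in p)

def as_function(p):
    """Interpret the coefficient tuple p as the function x -> sum a_i x^i."""
    return lambda x: sum(a * x**i for i, a in enumerate(p))

def fn_add(f, g):
    """Pointwise sum: x -> f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def fn_scale(k, f):
    """Pointwise scalar multiple: x -> k f(x)."""
    return lambda x: k * f(x)

p = (1, 0, 2)    # 1 + 2x^2
q = (0, 3, -1)   # 3x - x^2

# Adding coefficient tuples and adding the functions pointwise agree
# at every sample point we try.
sum_as_tuple = as_function(poly_add(p, q))
sum_pointwise = fn_add(as_function(p), as_function(q))
agree = all(sum_as_tuple(x) == sum_pointwise(x) for x in range(-3, 4))
print(poly_add(p, q), poly_scale(2, p), agree)
```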

Theorem 2.2.4. The set PF(K) is a vector space over K with pointwise addition and scalar multiplication. For any n, the set PF_n(K) is a vector space over K with pointwise addition and scalar multiplication.

The polynomial functions are actually only a small subset of a much larger collection, the infinitely differentiable functions on R. We make the following definition.

Definition 2.2.2. The collection of infinitely differentiable functions on R consists of all functions f : R −→ R for which f and all of its derivatives are defined on all of R. This set will be denoted by C^∞(R).

It should be clear that PF(R) ⊂ C^∞(R), but C^∞(R) contains other functions such as sin, cos, and x −→ e^x. These functions also form a vector space.

Theorem 2.2.5. The set C^∞(R) is a vector space over R with pointwise addition and scalar multiplication.

This last vector space is of great importance in the area of differential equations and is also interesting because it does not have a natural tuple structure.

Basic Properties of Vector Spaces

We conclude this section with the following theorem about vector spaces.

Theorem 2.2.6. Let V be a vector space over a field of scalars K. Then

1. The zero vector is unique.

2. Vector inverses are unique.

3. For any v ∈ V , 0v = 0.

4. Any scalar k times the zero vector is the zero vector (k0 = 0).

5. The scalar −1 times a vector v is the additive inverse of the vector.

6. If kv is 0 then either k = 0 or v = 0.

Proof. We prove (1) and (4) and leave the rest for Exercise 18.


(1): Suppose there are two zero vectors, 0 and v, in V. Consider the vector sum 0 + v. We can compute it in two ways. Since 0 is a zero vector, 0 + v = v. Since v is a zero vector, 0 + v = 0. Hence 0 = v.

(4): Using the first distributive law we know that for any v, k0 + kv = k(0 + v) = kv. Now adding −kv to both sides yields k0 = k0 + kv + (−kv) = kv + (−kv) = 0.

The combination of Properties (2) and (5) allows us to simplify our notation and to speak of “vector subtraction”. That is, the meaning of v − w is now clear: v − w = v + (−w). Our field of scalars K will also have an operation of scalar subtraction defined as the addition of additive inverses. A field will also have an operation of division (multiplication by multiplicative inverses). For example, in Z_5, 4/3 = 4 ∗ 2 = 3. Can we define an operation of vector division in a similar manner? Why or why not?
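Division in Z_p can be computed by searching for the element that undoes a multiplication, which is the same idea as the line ds := |s,t -> choose r in K | r .ms t = s| in name_vector_space below. The following Python sketch (not ISETL) mirrors that search; the function name ds is borrowed from the section's notation, while the extra parameter p and the choice to return None when there is no unique answer are assumptions of this sketch.

```python
def ds(s, t, p):
    """Return the unique r in Z_p with (r * t) mod p == s, or None if there
    is no such r or more than one (for example when t is 0)."""
    candidates = [r for r in range(p) if (r * t) % p == s]
    return candidates[0] if len(candidates) == 1 else None

print(ds(4, 3, 5))   # 3, matching 4/3 = 4 * 2 = 3 in Z_5
print(ds(1, 3, 5))   # 2, the multiplicative inverse of 3 in Z_5
print(ds(4, 0, 5))   # None: no r satisfies r * 0 = 4
```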

name_vector_space

Most of the remainder of our work in this course will be done within the context of a vector space. When we are writing ISETL code, it would be helpful to have an easy way of defining and referring to all the necessary pieces of a vector space. Carefully consider the code below. What does this code do? What kind of objects are accepted? What kind of objects are returned? Any time you wish to work in ISETL with a finite vector space, we strongly suggest that you first apply name_vector_space and then work exclusively with the standard notation V, K, va, vs, as, ss, ms, ds, sm, ov, os, is. For now, you might try applying name_vector_space to the vector space (Z_3)^3.

name_vector_space := proc(set_scal, op_add_scal, op_mult_scal,
                          set_vec, op_add_vec, op_scal_vec_mult);
    $ SETS
    V := set_vec;                                        $ set of vectors
    K := set_scal;                                       $ set of scalars
    $ OPERATIONS
    va := op_add_vec;                                    $ vector addition
    vs := |u,v -> choose w in V | v .va w = u|;          $ vector subtract’n
    as := op_add_scal;                                   $ add’n of scalars
    ss := |s,t -> choose r in K | r .as t = s|;          $ subt’n of scalars
    ms := op_mult_scal;                                  $ mult. of scalars
    ds := |s,t -> choose r in K | r .ms t = s|;          $ div’n of scalars
    sm := op_scal_vec_mult;                              $ scalar mult.
    $ DISTINGUISHED OBJECTS
    ov := choose o in V | forall v in V | o .va v = v;   $ Zero vector
    os := choose o in K | forall s in K | o .as s = s;   $ Zero scalar
    is := choose i in K | forall s in K | i .ms s = s;   $ Unit scalar

    write "Vector space objects defined: ","\n","\t",
          "V, K, va, vs, as, ss, ms, ds, sm, ov, os, is";
end proc;

Exercises

1. Let V be a set consisting of the single vector v, and let K = R. Let vector addition be defined by v + v = v and scalar multiplication by kv = v. Is V a vector space? If so, prove this. If not, list the axioms that are not satisfied by this system.

2. Prove that V = {〈a, 0, 0〉 : a ∈ Z_p} forms a vector space over Z_p.

3. Show that the line through the origin in R^3 in the direction 〈a, b, c〉 is a vector space. That is, show that {〈ta, tb, tc〉 : t ∈ R} is a vector space under the usual operations on vectors in R^3.

4. Show that the plane {〈x, y, z〉 : x, y, z ∈ R and ax + by + cz = 0} is a vector space. (Vector addition and scalar multiplication are the usual operations defined in R^3.)

5. Verify that C^n is a vector space over R.

6. Let V = {〈x, y〉 : x, y ∈ R, x ≥ 0}. Determine whether or not V forms a vector space over R under the usual operations of addition and multiplication for R^2.

7. Let V = R^3. Determine whether or not V forms a vector space over R under the usual operation of addition for R^3, if scalar multiplication is defined by k〈x, y, z〉 = 〈x, ky, z〉.

8. Let V = R^2. Determine whether or not V forms a vector space over R under the usual operation of addition for R^2, if scalar multiplication is defined by k〈x, y〉 = 〈0, 0〉.

9. Let V = R^2. Determine whether or not V forms a vector space over R if vector addition is defined by

〈x_1, y_1〉 + 〈x_2, y_2〉 = 〈(x_1^5 + x_2^5)^(1/5), (y_1^5 + y_2^5)^(1/5)〉,

and scalar multiplication is defined by

k〈x, y〉 = 〈k^(1/5) x, k^(1/5) y〉.

10. Let P_3 = {a_0 + a_1 x + a_2 x^2 + a_3 x^3 : a_0, a_1, a_2, a_3 ∈ R, a_3 ≠ 0} be the set of all polynomials of degree three with coefficients in R. Show that P_3 does not form a vector space over R under polynomial addition and scalar multiplication.

11. Prove Theorem 2.2.3.

12. Generalize the result of the previous exercise. That is, show that P_n(R) = set of all polynomials of degree n or less, forms a vector space over R.



15. Does the set of all real-valued discontinuous functions on S form a vector space over R under pointwise addition and scalar multiplication? Why not?

16. Generalize the proof of Theorem 2.2.1 for n ≥ 3.

17. Complete the proof of Theorem 2.2.2.


19. Let V be a vector space. Prove that for every u, v ∈ V there is a unique vector w ∈ V such that w + v = u. How does this property relate to the operation of vector subtraction?


2.3 Subspaces

Activities

1. Use the ISETL func subset on the pairs below to determine when W is a subset of V.

(a) W = {〈x, 1, 0〉 : x ∈ Z_5}, V = (Z_5)^3.

(b) W = {〈x, y〉 : x, y ∈ Z_3}, V = (Z_5)^2.

(c) W = {〈x, 0〉 : x ∈ Z_5}, V = (Z_5)^3.

(d) W = V = (Z_2)^4.

2. Write an ISETL func is_subspace that accepts a set W and a vector space (that is, [K, V, va, sm]). The action of your func is to determine whether or not W is a nonempty subset of V, and whether W is also a vector space over K using va and sm. Test your func on each of the systems below.

(a) W_1 = {〈1, 2〉, 〈2, 1〉, 〈0, 0〉}, V = (Z_3)^2.

(b) W_2 = {〈0, 0, x〉 : x ∈ Z_3}, V = (Z_3)^3.

(c) W_3 = {〈x, y, z, w〉 : x, y, z, w ∈ Z_3, x + y = 2, z + w = 1}, V = (Z_3)^4.

(d) W_4 = {〈x, y〉 : x, y ∈ Z_2}, V = (Z_3)^2.

(e) W_5 = {〈x, x〉 : x ∈ Z_5}, V = (Z_5)^2.

(f) W_6 = {〈1, 1, 1〉}, V = (Z_2)^3.

(g) W_7 = {〈0, 0, 0, 0〉}, V = (Z_2)^4.

(h) W_8 = {〈x, 3, z〉 : x, z ∈ Z_5}, V = (Z_5)^3.

(i) W_9 = {〈x, y, 0〉 : x, y ∈ Z_5}, V = (Z_5)^3.

(j) W_10 = {〈x, y, z〉 : x, y, z ∈ Z_5, x + y = z}, V = (Z_5)^3.

(k) W_11 = {〈x, y〉 : x, y ∈ Z_5, x + y = 0}, V = (Z_5)^2.

(l) W_12 = W_5 ∩ W_11, V = (Z_5)^2.

(m) W_13 = W_9 ∪ W_10, V = (Z_5)^3.


3. Find a subspace of (Z_5)^3 that is not W_8, W_9, or W_10. Use your func is_subspace to verify that your subset is a subspace.

4. Write an ISETL func is_subspace2 that accepts a set W and a vector space [K, V, va, sm]. The action of your func is to determine whether or not W is a nonempty subset of V, and whether or not W satisfies the vector space axioms 1, 4, 5, and 6. Test your func on the 13 systems in Activity 2.

5. Compare your results from Activities 2 and 4. For which systems do both funcs return true? For which systems do both funcs return false? Can you make a conjecture about the equivalence of is_subspace and is_subspace2?

6. Write an ISETL func that accepts as inputs a set W and a vector space [K, V, va, sm]. The action of your func is to determine whether or not W is a nonempty subset of V, and whether for all k ∈ K and w_1, w_2 ∈ W, kw_1 + w_2 ∈ W. Test your func on the systems given in Activity 2.

7. Compare your results from Activities 2 and 6. For which systems do both funcs return true? For which systems do both funcs return false? Can you make a conjecture about the equivalence of these two funcs?

Discussion

In these activities you explored subsets of vector spaces. In each case you worked with a subset of vectors from a vector space V over a field K, and you used the same operations of vector addition, scalar multiplication, and addition and multiplication of scalars that were defined for V and K. Sometimes this subset formed a vector space itself, and sometimes it did not. There is no general rule that would allow you to determine by inspection alone when such a subset will form a vector space, but we can make the following definition.

Definition 2.3.1. Let [K, V, va, sm] be a vector space over the field K, and let W be a nonempty subset of V. If [K, W, va, sm] is again a vector space over K, then we say that W is a subspace of V.

Note that in order to be a subspace, W must first be a nonempty subset of the vectors in V, and W must also satisfy all of the vector space axioms using the operations va and sm as they were defined for V over K. So, although the set of vectors W = (Z_2)^2 is a subset of V = (Z_3)^2, and the system [Z_2, W, ∗_2, +_2] is a vector space, W is not a subspace of V. Why not? There are two problems here: the vectors in V and W are defined over two different fields, and vector addition and scalar multiplication are done mod 2 in W whereas they are done mod 3 in V. We could of course use mod 3 arithmetic in W, but under these operations W will not be a vector space. Why not? Which vector space axioms are not satisfied by [Z_2, W, ∗_3, +_3]?

Now consider the vector space R^3 with the usual operations of vector addition and scalar multiplication, and the subset W = {〈x, y, z〉 : x + 2y + 3z = 0}. Is W a subset of R^3? Does W have a geometric interpretation? Is W itself a vector space?

Determination of Subspaces

One way of answering that last question is to check each of the ten vector space axioms for the system [R, W, ∗, +]. However this is much more work than is really necessary. Since the operations of vector addition and scalar multiplication are exactly the same for both R^3 and W, we do not need to recheck all of the vector space axioms for W. In fact, W will “inherit” commutativity, associativity, the distributive laws, and the scalar identity from R^3. Why? Which axioms does this allow us to avoid checking? Which axioms do we still need to check?

Your work in Activities 4 and 5 should have convinced you that we need only check four axioms: Axioms 1, 4, 5, and 6. We now check these axioms for [R, W, ∗, +].

Axiom 1: Let w_1 = 〈x_1, y_1, z_1〉 and w_2 = 〈x_2, y_2, z_2〉 be arbitrary vectors in W. Then w_1 + w_2 = 〈x_1 + x_2, y_1 + y_2, z_1 + z_2〉 and (x_1 + x_2) + 2(y_1 + y_2) + 3(z_1 + z_2) = (x_1 + 2y_1 + 3z_1) + (x_2 + 2y_2 + 3z_2) = 0 + 0 = 0, so w_1 + w_2 ∈ W, and W is closed under vector addition.

Axiom 4: Since 0 + 2·0 + 3·0 = 0, the vector 0 = 〈0, 0, 0〉 is in W. We do not need to check that w + 0 = w. Why not?

Axiom 5: Let w = 〈x, y, z〉 ∈ W. Since w ∈ R^3, there is a vector −w ∈ R^3 with w + (−w) = 0. We need to show that −w is in W. Since x + 2y + 3z = 0, −(x + 2y + 3z) = −x + 2(−y) + 3(−z) = 0, so −w ∈ W.

Axiom 6: Let k ∈ R and w = 〈x, y, z〉 ∈ W. Since x + 2y + 3z = 0, k(x + 2y + 3z) = kx + 2ky + 3kz = 0, so kw = 〈kx, ky, kz〉 ∈ W.

Thus W is in fact a subspace of R^3. Recall that W has a familiar geometric interpretation as a plane through the origin in 3-space. Can you find another geometric subspace of R^3?
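The closure computations for W = {〈x, y, z〉 : x + 2y + 3z = 0} can be spot-checked on a finite sample of points. The sketch below is in Python rather than ISETL; the names in_W and point_of_W and the sample range of integers from −3 to 3 are our own choices. Because R is infinite, a check like this can only support the argument above; it cannot replace it.

```python
from itertools import product

def in_W(w):
    """Membership test for the plane x + 2y + 3z = 0."""
    x, y, z = w
    return x + 2 * y + 3 * z == 0

def point_of_W(y, z):
    """The point of W with the given y and z (x is then forced)."""
    return (-2 * y - 3 * z, y, z)

vals = range(-3, 4)
points = [point_of_W(y, z) for y, z in product(vals, repeat=2)]

# Axiom 1 (closure under addition) and Axiom 6 (closure under scalar
# multiplication), tested on the sample only.
closed_add = all(in_W(tuple(a + b for a, b in zip(u, v)))
                 for u in points for v in points)
closed_sm = all(in_W(tuple(k * a for a in w))
                for k in vals for w in points)
print(closed_add, closed_sm, in_W((0, 0, 0)))
```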

Suppose W_2 = {〈x, y, z〉 : x + 2y + 3z = 2} is another plane in 3-space. How does W_2 differ from W? Is W_2 a subspace of R^3? Which of Axioms 1, 4, 5, or 6 does W_2 fail to satisfy?

We can generalize these results in a theorem:

Theorem 2.3.1. A nonempty subset W of a vector space V over K is a subspace if and only if W is closed under the inherited vector addition and scalar multiplication, the zero vector is in W, and each vector w in W has a vector inverse −w in W.

Proof. (=⇒) If W is a subspace of V over K, then W is itself a vector space and therefore satisfies all ten vector space axioms.

(⇐=) The proof of this is similar to our work above and is left for Exercise 6.

In Activities 6 and 7, you may have observed that it is not necessary to check all four of these axioms separately. You may have conjectured the following theorem.

Theorem 2.3.2. A nonempty subset W of a vector space V over K is a subspace if and only if for all w_1, w_2 ∈ W and k ∈ K, kw_1 + w_2 ∈ W.

Proof. (=⇒) Left as an exercise (see Exercise 7).

(⇐=) We give only a rough sketch of the proof, and leave the details for Exercise 15. Use Theorem 2.3.1 so that we only need to verify four axioms for W. Assume kw_1 + w_2 ∈ W. If we choose k = 1, then we can easily show that W is closed under vector addition. Since W is nonempty, we can find a vector w ∈ W and let w = w_1 = w_2. Then by letting k = −1, one can show that 0 ∈ W. Still using k = −1, but letting w_2 = 0 (which we now know is in W), one can show that vector inverses are in W. Finally, letting w_2 be 0, and k, w be arbitrary will show that W is closed under scalar multiplication.
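For a finite vector space, the single condition in Theorem 2.3.2 can be tested directly. The following Python sketch (not ISETL) does so for two subsets of (Z_5)^2. The function name one_step_test and the two sample subsets W_line and W_shifted are hypothetical and are not taken from the Activities; both subsets are contained in (Z_5)^2 by construction, so the sketch does not test the subset condition separately.

```python
def one_step_test(W, K, va, sm):
    """True when W is nonempty and k*w1 + w2 lies in W for all k in K
    and all w1, w2 in W."""
    Wset = set(W)
    return bool(Wset) and all(va(sm(k, w1), w2) in Wset
                              for k in K for w1 in W for w2 in W)

p = 5
K = range(p)
va = lambda u, v: tuple((a + b) % p for a, b in zip(u, v))
sm = lambda k, v: tuple((k * a) % p for a in v)

# Two sample subsets of (Z_5)^2 (our own examples).
W_line = [(x, (2 * x) % p) for x in K]       # vectors of the form <x, 2x>
W_shifted = [(x, (x + 1) % p) for x in K]    # vectors of the form <x, x + 1>

print(one_step_test(W_line, K, va, sm))      # True
print(one_step_test(W_shifted, K, va, sm))   # False
```

The second subset does not contain 〈0, 0〉, which is one quick way to see why the test must fail for it.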


Any vector space V will have at least two subspaces, the subspace V itself, and the zero subspace (consisting solely of the vector 0). Why are these both subspaces? Why are they called “improper” subspaces? Does every vector space necessarily have “proper” subspaces?

Is R^2 a subspace of R^3? Carefully re-read the definition of a subspace. Can you see why R^2 is not a subspace of R^3? Is W^2 = {〈x, y, 0〉 : x, y ∈ R} a subspace of R^3? Note that the subspace W^2 “looks like” or “behaves” exactly like R^2. We say that R^2 and W^2 are isomorphic vector spaces, and that W^2 is an “embedding” of R^2 in R^3. Are there other copies of R^2 that can be embedded in R^3?
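What it means for W^2 to “behave exactly like” R^2 can be illustrated with the map 〈x, y〉 −→ 〈x, y, 0〉. The Python sketch below (not ISETL) checks, for one arbitrary pair of vectors and one scalar, that adding or scaling before applying the map gives the same result as adding or scaling afterwards. The names embed, add and scale and the sample values are our own, and a single example of course proves nothing in general.

```python
def embed(v):
    """Send <x, y> in R^2 to <x, y, 0> in R^3."""
    x, y = v
    return (x, y, 0)

def add(u, v):
    """Component-wise vector addition."""
    return tuple(a + b for a, b in zip(u, v))

def scale(k, v):
    """Component-wise scalar multiplication."""
    return tuple(k * a for a in v)

u, v, k = (1, -2), (3, 5), 4

# The map respects both operations for these sample values.
print(embed(add(u, v)) == add(embed(u), embed(v)))     # True
print(embed(scale(k, u)) == scale(k, embed(u)))        # True
```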

Is R^1 a subspace of R^3? Of R^2? Can you find a subspace W^1 of R^3 that is isomorphic to R^1? How many different subspaces isomorphic to R^1 are there in R^3? Can you find a subspace of R^n that is isomorphic to R^m for all m < n?


When we discussed the polynomials, polynomial functions, and the infinitely differentiable functions, some subset relationships were presented. We are now able to describe the relationship between these sets more clearly in the following theorems.

Theorem 2.3.3. For any set of scalars K and n, m with n < m, the following statements are true:

• P_n(K) is a subspace of P_m(K);

• P_n(K) is a subspace of P(K).

Theorem 2.3.4. For any set of scalars K and n, m with n < m, the following statements are true:

• PF_n(K) is a subspace of PF_m(K);

• PF_n(K) is a subspace of PF(K).

Theorem 2.3.5. The following statements are true:

• PF_n(R) is a subspace of C^∞(R);

• PF(R) is a subspace of C^∞(R).


Exercises

1. Let V be a vector space. Prove that {0} is a subspace of V .

2. Let L be a line through the origin in R3. Prove that L is a subspace of R3.

3. Show that the set of all points on the line y = mx + b is a subspace of R2 if and only if b = 0.

4. Show that the set of all points in the plane ax + by + cz = d is a subspace of R3 if and only if d = 0.

5. Let W be the subset of P2(R) consisting of all polynomials of the form f(x) = a1x + a2x^2, a1, a2 ∈ R. Determine whether or not W is a subspace of P2(R).



8. Which of the following are subspaces of R3?

(a) W = {〈x, y, z〉 : x − z = 1, y + z = 2}.

(b) W = {〈x, y, z〉 : xy = 0}.

(c) W = {〈0, y, 0〉}.




12. Which of the following subsets of C∞(R) are subspaces of C∞(R)?


(a) {f ∈ C∞(R) : f(−1) = 0}

(b) {f ∈ C∞(R) : f(0) = f(1)}

(c) {f ∈ C∞(R) : f ≤ 0}

(d) {f ∈ C∞(R) : f(x) = f(−x)}

(e) {f ∈ C∞(R) : f(x^2) = (f(x))^2}

(f) {f ∈ C∞(R) : f(x^2) = 2(f(x))}

(g) {f ∈ C∞(R) : f(x) = a}

13. Let W1 and W2 be subspaces of a vector space V. Is W1 ∩ W2 a subspace? If so, prove it. If not, find a counterexample.

14. Let W1 and W2 be subspaces of a vector space V. Is W1 ∪ W2 a subspace? If so, prove it. If not, find a counterexample.

15. Let W = {(x, y) : x^2 + y^2 ≤ 9} be a subset of R2. (W is a disk of radius 3.) Is W a subspace of R2? Why or why not?

16. Is Q3 a subspace of R3? (What is K?)

17. Let m ≤ n. Find two distinct subspaces of Rn that are isomorphic to Rm.

Chapter 3

First Look at Systems

In this chapter, you will certainly recognize ideas that anyone would call algebra. We revisit systems of equations (perhaps your high school text called them simultaneous systems) and explore a couple of methods for finding the solutions to these systems. You will find some interesting procedures in the next sections and probably some new interpretations for things you have met before.

3.1 Systems of Equations

Activities

1. Let K = Z3, the set of integers modulo 3. Write a statement in ISETL that determines whether each of the tuples [x, y, z] = [2, 1, 1], [1, 1, 1], [2, 2, 2], and [1, 0, 0] is a solution of the equation

2x+ y + 2z = 1.

2. Let K = Z3. Construct a func in ISETL that accepts a sequence [x, y, z] of three elements of K as input; that substitutes the elements of the sequence into the respective unknowns of the equation 2x + y + 2z = 1; and that returns true if the substituted values result in the equation being true, or returns false if the substituted values result in the equation being false. Use this func to find the solution set of the equation.

3. Use the func you wrote in the prior activity to construct the solution set of the equation 2x + y + 2z = 1 in K = Z3. In particular, you will want to define the set in such a manner that you iterate through every possible sequence of three elements in K (test every sequence over K) to identify all possible solutions.

4. Given K = Zp, where p is a prime number, and given a linear equation in K such as

a1x1 + a2x2 + · · ·+ aqxq = c,

where ai ∈ K, i = 1, . . . , q, and c ∈ K, construct a func One eqn that accepts the modulus p, the sequence [a1, a2, . . . , aq] over K of coefficients, and the constant c as input, and that yields the set of all solutions [x1, x2, . . . , xq] of the equation as output. Test your func on the equation defined in Activity 1.

5. Let K = Z2. Consider the equations

x+ y + z = 1

x + z = 0.

Use the func One eqn you wrote in the last activity to determine the solution set of the first equation. Then use the same func to determine the solution set of the second equation. Find the intersection of both solution sets. What property do the elements of the intersection set have? What is the solution set of these two equations taken as a single system?

6. Let K = Z2. Construct a func in ISETL that accepts a sequence [x, y, z] of three elements as input; that substitutes the elements of the sequence into the respective unknowns of the equations

x+ y + z = 1

x + z = 0;

and that returns true if the substituted values result in both equations being true, or returns false if the substituted values result in one or both equations being false.

7. Use ISETL code to construct the solution set of the system of equations given by

x+ y + z = 1

x + z = 0

in K = Z2. In particular, you will want to define the set in such a manner that you iterate through every possible sequence of three elements in K (test every sequence over K) to identify all possible solutions.

8. Given two equations

a1x1 + a2x2 + · · ·+ aqxq = c1

b1x1 + b2x2 + · · ·+ bqxq = c2

in K = Zp, where p is a prime number, construct a func Two eqn that accepts the modulus p, two sequences of coefficients, [a1, a2, . . . , aq] and [b1, b2, . . . , bq], and two constants, c1 and c2, as input, and that returns the set of all solutions [x1, x2, . . . , xq] of both equations as output.

9. Given a system of three equations

a1x1 + a2x2 + · · ·+ aqxq = c1

b1x1 + b2x2 + · · ·+ bqxq = c2

d1x1 + d2x2 + · · ·+ dqxq = c3

in K = Zp, where p is a prime number, construct a func Three eqn that accepts the modulus p, three sequences of coefficients, [a1, a2, . . . , aq], [b1, b2, . . . , bq], and [d1, d2, . . . , dq], and three constants, c1, c2, and c3, as input and that returns the set of all solutions [x1, x2, . . . , xq] to the system as output. Test your func on the system

x+ 2y + z = 1

2x+ y + 2z = 2

2x+ 2y + z = 1

over Z3. Describe the process for constructing such a func for any number of equations.

10. Let K = Z5. Answer the following set of questions in relation to the system of equations in Z5 given below.

3x1 + 2x2 + x3 = 2

x1 + 4x2 + 4x3 = 3

2x1 + x2 + 2x3 = 2.

(a) Find the solution of this system using the func Three eqn you wrote before.

(b) Interchange the first and second equations of the system. Find the solution of this new system using the func Three eqn you wrote before. What do you observe?

(c) Multiply both sides of equation 2 by 3. Replace the second equation by this new equation. Apply the func Three eqn to this transformed system. What do you observe?

(d) Multiply both sides of equation 2 by 3. Then, add the modified version of equation 2 to equation 1. Replace the second equation by this new equation. Apply the func Three eqn to this transformed system. What do you observe?


(e) What operations can you do to transform the equations of the system without changing its solution set?

11. Let K = Z5. Answer the following set of questions in relation to the system in Z5 given below.

2x1 + 3x2 + x3 = 3

x1 + 4x2 + 2x3 = 1

3x1 + x2 + 2x3 = 2.

(a) Apply the func Three eqn to find the set of sequences [x1, x2, x3] that are simultaneous solutions of all three equations.

(b) Multiply both sides of equation 2 by 3. Then, add the modified version of equation 2 to equation 1. In particular,

R2′(“new eqn 2”) = R1 + 3R2.

Apply the func Three eqn to the system

2x1 + 3x2 + x3 = 3

R2′

3x1 + x2 + 2x3 = 2.

Compare the solution set of this system to the original.

(c) Add equation 1 to equation 3. In particular,

R3′(“new eqn 3”) = R1 +R3.


2x1 + 3x2 + x3 = 3

R2′

R3′.

Compare the solution set of this system to the original.


(d) Interchange rows 2 and 3. In particular,

R2′′ = R3′

R3′′ = R2′.


2x1 + 3x2 + x3 = 3

R2′′

R3′′.

Compare the solution set of this system to that of the original.

(e) Does the process outlined in parts (b), (c), and (d) change the solution set of the system? Why does the process described here appear to be effective in helping us to identify the solution of the original system?

12. Let K = Z5. Given the system of equations

3x1 + x2 + 4x3 = 1

x1 + 3x2 + 3x3 = 4

4x1 + x2 + 3x3 = 3,

find the solution set by hand using a process similar to what was outlined in the prior activity. Verify your work by applying the func Three eqn to both the original system and the simplified system you produced by hand. What do you observe?

Discussion

Algebraic Expressions and Linear Equations

In previous courses in algebra, you spent a great deal of time working with algebraic expressions. Do you remember the difference between an algebraic expression and an equation? In some cases, you were asked to simplify expressions by applying the distributive property; the exercise

Simplify the expression: 6x(x + y) − 3(x^2 − 2xy)

is such an example. Similarly, you were assigned problems in which you were asked to combine like terms. Exercises such as

Simplify by combining like terms: 5bcd − 8cd − 12bcd + cd

fit into this category and are probably very familiar to you. You also spent considerable time factoring polynomials like

4x^3 + 27x^2 + 5x − 3

25a^2 − 20ab + 4b^2.

What was the purpose of these tasks?

Although these exercises may have seemed pointless, they were designed with several objectives in mind: in particular, you were being taught about the concept of variable. What are the values that each variable can assume in algebraic expressions such as

6x(x + y) − 3(x^2 − 2xy)?

That is, what values can you select for x and for y which, when substituted into the expression, yield a single number answer?

On the other hand, if we take one of the algebraic expressions above, such as 4x^3 + 27x^2 + 5x − 3, and set it equal to, say, 4, which yields 4x^3 + 27x^2 + 5x − 3 = 4, we now have an equation. What happens if you substitute values for x in this case? Is it always a true statement?

In a similar fashion, if we take two of the other expressions given above, say 5bcd − 8cd − 12bcd + cd and 25a^2 − 20ab + 4b^2, and set them equal to one another, the resulting equation

5bcd − 8cd − 12bcd + cd = 25a^2 − 20ab + 4b^2

will be true only for appropriately selected sequences [a, b, c, d] of values for a, b, c, and d. Can you find some examples of values for a, b, c, and d such that the equation will be false? Can you find some examples of values for a, b, c, and d such that the equation will be true? The set of values of an unknown, or of sequences of values of several unknowns, that make a given equation true is called the solution set of the equation. In this section and throughout the remainder of this course, we will focus our attention on finding solution sets of linear equations and systems of linear equations. Do you recall the difference between a linear equation and one that is not linear? Can you give an example of each?

A linear equation is any equation of the form

a1x1 + a2x2 + · · ·+ aqxq = c,

where a1, a2, . . . , aq, and c are constants in K, where K is the set of real numbers or the set Zp, with p prime, and x1, x2, . . . , xq are unknowns. Equations such as 3(x1)^2 + 4(x2)^2 = −2, 3y sin(y) = −1, or 2x^4 − 4xy + 5y = 7 are not linear equations.

Definition 3.1.1. Let K be a field. A linear equation with coefficients in K is of the form a1x1 + a2x2 + · · · + amxm = c, where ai ∈ K, i = 1, 2, . . . , m, denote the coefficients, xi, i = 1, 2, . . . , m, represent the unknowns, and c ∈ K is a constant.

Definition 3.1.2. A sequence [s1, s2, . . . , sm] is a solution of the equation if, when si is substituted for xi, i = 1, 2, . . . , m, the equation

a1s1 + a2s2 + · · · + amsm = c

is true. The solution set of a linear equation is the collection of all such solutions.

In Activity 1, you were asked to determine whether the sequences [2, 1, 1], [1, 1, 1], [2, 2, 2], and [1, 0, 0] are solutions of the equation 2x + y + 2z = 1 in K = Z3. Remember that all the equalities in the activities where the variables are elements of a finite field are congruences, and that the notation Zk implies that all operations are done modulo k. For example, when we write 3x + 2y = 4 where x and y are in Z7, we mean 3x + 2y ≡ 4 (mod 7). What do you have to ask when you want to know whether a sequence is an element of the solution set of an equation? By substituting the sequences, you were able to find which of them are solutions of the equation. Can you find all the sequences that are solutions to this equation?
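The substitution test can be sketched as follows. This is an illustration in Python rather than ISETL, and the name `is_solution` is hypothetical; the `% P` operation plays the role of doing the arithmetic modulo 3.

```python
P = 3  # modulus for K = Z_3

def is_solution(seq):
    """True when [x, y, z] satisfies 2x + y + 2z = 1 in Z_3."""
    x, y, z = seq
    return (2 * x + y + 2 * z) % P == 1

candidates = [[2, 1, 1], [1, 1, 1], [2, 2, 2], [1, 0, 0]]
results = [is_solution(s) for s in candidates]
print(results)  # [True, False, True, False]
```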


What is the purpose of the func you wrote in Activity 2? How can you find all the elements of the solution set of an equation? In Activity 3 you answered this question for a particular equation, and in Activity 4 you constructed and tested a func that would return the solution set of a single linear equation in K = Zp, where p is prime; in particular, you were asked to write code that would input a sequence of coefficients and a constant of a linear equation, and then return the solution set of the equation by iterating through every possible sequence of elements of K.
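One possible shape for such an exhaustive search is sketched below in Python (not ISETL). The function name `one_eqn` and its argument order are assumptions made for this illustration; `itertools.product` generates every sequence of the required length over Zp.

```python
from itertools import product

def one_eqn(p, coeffs, c):
    """Set of all tuples over Z_p satisfying a1*x1 + ... + aq*xq = c (mod p)."""
    return {
        xs
        for xs in product(range(p), repeat=len(coeffs))
        if sum(a * x for a, x in zip(coeffs, xs)) % p == c % p
    }

# The equation 2x + y + 2z = 1 over Z_3 from Activity 1.
solutions = one_eqn(3, [2, 1, 2], 1)
print(len(solutions))
print(sorted(solutions))
```

The search inspects p^q sequences, which is practical only when both p and q are small.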

If K is finite and small, as it was in the activities, it is possible to check every possible sequence of values for the unknowns of an equation. If K is infinite, however, we cannot check every possible sequence. For example, if K = R, the set of real numbers, we cannot find the solution set of a linear equation by checking every sequence of possible values for the unknowns, because there are infinitely many possibilities to check. Instead, we try to determine the solution set by transforming the original equation into a simpler but equivalent equation, that is, an equation that has the same solution set, whose solution can be easily identified. For linear equations, this involves isolating an unknown variable. For example, given an equation like 2x + 3 = 11, what are the transformations you would do to find an equivalent equation which tells you directly the solution to the original equation? What properties do you use to transform the equation into an equivalent one? How do we know that the method for transforming the equation yields each and every solution?

Forms of Solution Sets

In the case of a linear equation ax = c in a single variable with a ≠ 0, we know that there is exactly one solution. For a linear equation of more than one variable, say 4x − y = 5, this is not the case. We can simplify the equation, however, by isolating y. What are the transformations you would do in this case? Can you identify the solution set of this equation? Can you identify the geometric representation of the solution set? We will return to this in the last section of this chapter.

If the variables of an equation are elements of a finite field, we can always list every solution in its solution set. If the variables are elements of an infinite field, this is not always the case. Why?

The solution set of the equation we were considering before can be expressed in a variety of ways. If we simply isolate y, we get the form y = 4x − 5. If x and y are in R, we can select any value for x, which, via the expression, yields a corresponding value for y. If we let x = t, then we get the parametric form

x = t

y = 4t− 5

of the equation. The solution set of this equation can also be written in vector form. For example, we can interpret the solution set of the equation 4x − y = 5 as consisting of all vectors 〈a, b〉 in R2 whose components, when substituted as x = a, y = b, make the equation true; that is, the vectors that are solutions of the equation. The expression for the x coordinate is placed in the first component, and the expression for y is placed in the second component. The vector form of the solution set of 4x − y = 5 is given by the set

S = {〈t, 4t− 5〉 : t ∈ R}.

Can you express the vectors in S in terms of other vectors, using the operations you learned in Chapter 2? Is S a vector space?

We can also express the solution as a sequence. In this case, we are interpreting the solution of the equation 4x − y = 5 as the set of all sequences [c, d] of elements in R such that if we let x = c and y = d, the equation is true. In this case, the solution set of the given equation assumes the form

S = {[c, 4c− 5] : c ∈ R}.

Given a linear equation in four variables, say

3x1 + 2x2 − 4x3 + x4 = 5,

to obtain a solution set we would follow the same basic procedure as we did in simplifying the linear equation in two variables; in particular, we would transform the equation into an equivalent one where the first unknown is isolated:

\[ x_1 = \frac{5}{3} - \frac{2}{3}x_2 + \frac{4}{3}x_3 - \frac{1}{3}x_4. \]

Can you identify the equivalent equations involved in the transformation of this equation? In this solution, x2, x3, and x4 are free variables, because they can assume any value, while x1 is dependent upon, or is determined by, the values selected for x2, x3, and x4. Must x1 necessarily be the “dependent” variable? What is the vector form of the solution set of this equation? The sequence form? What are the vector and sequence forms of the solution set of the general linear equation given in Definition 3.1.1?

In vector form, the solution set is given by

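\[ S = \left\{ \left\langle \frac{5}{3} - \frac{2}{3}t_1 + \frac{4}{3}t_2 - \frac{1}{3}t_3,\; t_1,\; t_2,\; t_3 \right\rangle : t_1, t_2, t_3 \in \mathbb{R} \right\}, \]

where S represents the set of vectors in R4 whose components satisfy the equation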

3x1 + 2x2 − 4x3 + x4 = 5.

In sequence form, the solution set looks like

\[ S = \left\{ \left[ \frac{5}{3} - \frac{2}{3}t_1 + \frac{4}{3}t_2 - \frac{1}{3}t_3,\; t_1,\; t_2,\; t_3 \right] : t_1, t_2, t_3 \in \mathbb{R} \right\}, \]

where S represents all combinations of values for the unknowns that are solutions of the equation.
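The role of the free variables can be checked numerically. The sketch below, in Python with exact fractions, builds a solution from any choice of values for x2, x3, and x4 and substitutes it back into 3x1 + 2x2 − 4x3 + x4 = 5; the helper names `solution` and `lhs` and the sample values are hypothetical.

```python
from fractions import Fraction as F

def solution(t1, t2, t3):
    """Solution of 3*x1 + 2*x2 - 4*x3 + x4 = 5 with free values x2 = t1, x3 = t2, x4 = t3."""
    x1 = F(5, 3) - F(2, 3) * t1 + F(4, 3) * t2 - F(1, 3) * t3
    return [x1, F(t1), F(t2), F(t3)]

def lhs(x):
    """Left side of the equation evaluated at the sequence x."""
    return 3 * x[0] + 2 * x[1] - 4 * x[2] + x[3]

# Three arbitrary choices of the free variables.
samples = [solution(0, 0, 0), solution(1, -2, 7), solution(F(1, 2), 3, -4)]
print([lhs(x) == 5 for x in samples])  # [True, True, True]
```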

In general, for a single linear equation

a1x1 + a2x2 + · · ·+ aqxq = c

with a1 ≠ 0 and K = R, the solution set in vector form can be written as

\[ S = \left\{ \left\langle \frac{c}{a_1} - \frac{a_2}{a_1}t_1 - \frac{a_3}{a_1}t_2 - \cdots - \frac{a_q}{a_1}t_{q-1},\; t_1,\; t_2,\; \ldots,\; t_{q-1} \right\rangle : t_1, t_2, \ldots, t_{q-1} \in \mathbb{R} \right\}, \]

where S is the set of vectors in Rq whose components are solutions of the equation; and the sequence form of the solution set is given by the set

\[ S = \left\{ \left[ \frac{c}{a_1} - \frac{a_2}{a_1}t_1 - \frac{a_3}{a_1}t_2 - \cdots - \frac{a_q}{a_1}t_{q-1},\; t_1,\; t_2,\; \ldots,\; t_{q-1} \right] : t_1, t_2, \ldots, t_{q-1} \in \mathbb{R} \right\}, \]

where S represents all combinations of values for the unknowns that are solutions of the equation. Note that any xi, i = 1, . . . , q, with ai ≠ 0 can be used as the dependent variable by solving for it as we did for x1. Is S a subspace of Rq?


Systems of Linear Equations

A system of equations is a collection of two or more linear equations. We are interested in finding the solution set of systems of equations. In Activity 5 you used what you learned in previous activities to find the solution set of each of the equations that form the given system. Then you found the solution set of the system. Can you define the meaning of the solution set of a system of equations? How is the solution set of the system related to the solution set of each of the equations? In Activity 6 you worked with the same system, but now you constructed a func that allowed you to check whether any sequence of your choice would be a solution to the system, and in Activity 7 you asked the computer to determine all solutions by testing every possible sequence in the func you constructed in Activity 6. You then generalized this process in Activity 8 by constructing a func that would accept the coefficients and the constants corresponding to any pair of equations in Zp, p prime, and that would return the solution set of the system. Can you explain in your own words how this func works?
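The relationship between the solution set of a system and the solution sets of its individual equations can be explored with a short brute-force sketch. It is written in Python rather than ISETL, and the name `solve_system` and the representation of an equation as a (coefficients, constant) pair are assumptions made for this illustration.

```python
from itertools import product

def solve_system(p, equations):
    """Brute force over Z_p; equations is a list of (coeffs, c) pairs."""
    q = len(equations[0][0])
    return {
        xs
        for xs in product(range(p), repeat=q)
        if all(sum(a * x for a, x in zip(coeffs, xs)) % p == c % p
               for coeffs, c in equations)
    }

eq1 = ([1, 1, 1], 1)   # x + y + z = 1
eq2 = ([1, 0, 1], 0)   # x     + z = 0

both = solve_system(2, [eq1, eq2])
intersection = solve_system(2, [eq1]) & solve_system(2, [eq2])
print(sorted(both))
print(both == intersection)
```

Because `equations` is a list, the same function handles two, three, or any number of equations.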

In Activity 9 you constructed and tested a func for solving a system of three equations in Zp, p prime. You were then asked how you would generalize the procedure to construct a func for any number of equations in K = Zp. Can you use these ideas to describe a general process for finding the solution of any system of m equations and n unknowns, where n and m are any integers larger than 1, and K is a finite field? If K is finite, it is possible to construct the solution set of a system of linear equations by writing code that checks every possible sequence of values for the unknowns. Why? If K is an infinite set, for example, the real numbers or the complex numbers, such iteration is not possible. As with a single linear equation in R, it is necessary to devise a process that transforms the original system into a simpler, equivalent system whose solution set is readily identifiable. In order to do this you were asked in Activity 10 to perform some operations on a system of equations and to solve the transformed systems. What did you find about the solution of each of those systems? The operations used are called elementary transformations and, as you found out, these operations transform the system into a new system that has the same solution set.

Definition 3.1.3. Given a system of m equations and n unknowns over a field K, the original system can be transformed into a simpler, equivalent system by applying one or more elementary transformations, each of which is listed below:


• Interchange the position of two equations.

• Multiply both sides of an equation by a nonzero constant.

• Add a multiple of one equation to another equation.
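The three elementary transformations can be sketched as operations on rows of coefficients. The Python illustration below (not ISETL) uses the Z5 system from Activity 11 as test data; each row is stored as [a1, a2, a3, c], rows are indexed from 0, and the function names are hypothetical.

```python
from itertools import product

P = 5  # K = Z_5

def solutions(rows):
    """Brute-force solution set; each row is [a1, a2, a3, c] over Z_P."""
    return {
        xs for xs in product(range(P), repeat=3)
        if all(sum(a * x for a, x in zip(r[:3], xs)) % P == r[3] % P for r in rows)
    }

def swap(rows, i, j):
    """Interchange equations i and j."""
    rows = [r[:] for r in rows]
    rows[i], rows[j] = rows[j], rows[i]
    return rows

def scale(rows, i, k):
    """Multiply both sides of equation i by the nonzero constant k."""
    assert k % P != 0, "the constant must be nonzero"
    rows = [r[:] for r in rows]
    rows[i] = [(k * a) % P for a in rows[i]]
    return rows

def add_multiple(rows, src, k, dst):
    """Replace equation dst by (equation dst) + k * (equation src)."""
    rows = [r[:] for r in rows]
    rows[dst] = [(b + k * a) % P for a, b in zip(rows[src], rows[dst])]
    return rows

system = [[2, 3, 1, 3], [1, 4, 2, 1], [3, 1, 2, 2]]
step1 = add_multiple(scale(system, 1, 3), 0, 1, 1)   # R2' = R1 + 3*R2
step2 = add_multiple(step1, 0, 1, 2)                  # R3' = R1 + R3
step3 = swap(step2, 1, 2)                             # interchange equations 2 and 3
print(step3)                                          # [[2, 3, 1, 3], [0, 4, 3, 0], [0, 0, 2, 1]]
print(solutions(system) == solutions(step3))          # True
```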

In Activity 11 you were asked to transform a system where K = Z5 into equivalent systems, that is, into systems that have the same solution set. It is important to remember that all the operations in this activity are done modulo 5. In the first step, you multiplied the second equation by 3 and added the result to the first equation to produce a “new” second equation. This operation involved two elementary transformations: the first involved multiplying the second equation by the constant 3; the second involved adding the first equation to the resulting second equation. What did you observe? Why might the form you obtained be considered “simpler”? If you did not have access to the func, how would you find the solution of the system? Why do you suppose the solution sets of the original system and the simpler system are equal? In the second step, you performed a similar transformation to alter the third equation. Again, the resulting system yielded the same solution set. In the last step, you applied the remaining type of elementary transformation: you interchanged equations 2 and 3. The final form

2x1 + 3x2 + x3 = 3

4x2 + 3x3 = 0

2x3 = 1

is an equivalent system. Unlike the original system, its solution set can be readily identified by hand. Specifically, the last equation reveals that x3 = 3. If we substitute this value into the second equation, we see that

4x2 + 3(3) = 0,

from which it follows that x2 = 4. If we substitute x2 = 4 and x3 = 3 into the first equation, we get

2x1 + 3(4) + 3 = 3,

which yields x1 = 4. Hence, the original system in K = Z5

2x1 + 3x2 + x3 = 3

x1 + 4x2 + 2x3 = 1

3x1 + x2 + 2x3 = 2,


has only one solution, namely x1 = 4, x2 = 4, x3 = 3. If the solution set is written in vector form, we have

S = {〈4, 4, 3〉},

and if it is expressed in sequence form, we get

S = {[4, 4, 3]}.
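The substitution carried out by hand above can be sketched as a small routine. This Python illustration assumes the system is already triangular with nonzero leading coefficients; dividing by a coefficient in Z5 is done by multiplying by its inverse, computed here with Fermat's little theorem. The names `inv` and `back_substitute` are hypothetical.

```python
P = 5  # K = Z_5

def inv(a):
    """Multiplicative inverse of a nonzero element of Z_P (P prime)."""
    return pow(a, P - 2, P)

def back_substitute(rows):
    """Solve a triangular system over Z_P; each row is [a1, ..., an, c]."""
    n = len(rows)
    x = [0] * n
    for i in reversed(range(n)):
        # Contribution of the unknowns that have already been determined.
        known = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = ((rows[i][n] - known) * inv(rows[i][i])) % P
    return x

triangular = [[2, 3, 1, 3],
              [0, 4, 3, 0],
              [0, 0, 2, 1]]
print(back_substitute(triangular))  # [4, 4, 3]
```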

Observe that the simplified system

2x1 + 3x2 + x3 = 3

4x2 + 3x3 = 0

2x3 = 1

has no x1 term in the second equation and neither an x1 nor an x2 term in the third equation. We could have added additional steps to simplify even further. In particular, if we multiply each equation by a suitable nonzero constant, we eventually get a triangular-looking system

x1 + 4x2 + 3x3 = 4

x2 + 2x3 = 0

x3 = 3

said to be in echelon form. The entries corresponding to the x1 term in the first equation, the x2 term in the second equation, and the x3 term in the third equation are called leading entries.
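The "suitable nonzero constant" for each equation is the inverse in Z5 of its leading coefficient. A minimal Python sketch, with the hypothetical name `normalize`, scales each row of the triangular system accordingly:

```python
P = 5  # K = Z_5

def normalize(rows):
    """Scale each row so that its first nonzero coefficient becomes 1 in Z_P."""
    result = []
    for row in rows:
        lead = next(a for a in row[:-1] if a % P != 0)
        k = pow(lead, P - 2, P)          # inverse of the leading coefficient
        result.append([(k * a) % P for a in row])
    return result

triangular = [[2, 3, 1, 3], [0, 4, 3, 0], [0, 0, 2, 1]]
print(normalize(triangular))
# [[1, 4, 3, 4], [0, 1, 2, 0], [0, 0, 1, 3]]
```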

As you can see, the process of transforming a system of equations into echelon form involves isolating variables: in particular, we isolated x3 and then used it to isolate x2, whereby we then isolated x1. The three elementary transformations (interchanging two equations, multiplying both sides of an equation by a nonzero constant, and adding a multiple of one equation to another) are the tools by which we can transform a system of equations into an equivalent system that is in echelon form. Can you transform the system given in Activity 10 into an equivalent system which is in echelon form?

In Chapter 6 it will be shown that the process used to transform the system into its echelon form does not change the solution set of any system of linear equations. Before we think about a proof, let’s consider the following example in R:

2x1 − x2 + 3x3 + x4 = −2

3x1 + 2x2 − 4x3 + 2x4 = 3

−x1 + 4x2 − 2x3 + 5x4 = 1.

Based upon what you did in Activity 11, the first goal is to transform the original system into an equivalent system in which the x1 term in the second equation vanishes. What elementary transformation has been performed to transform the system into

2x1 − x2 + 3x3 + x4 = −2

7x2 − 17x3 + x4 = 12

−x1 + 4x2 − 2x3 + 5x4 = 1?

The next step might be to eliminate the x1 term from the third equation. What elementary transformation was used to transform the system into

2x1 − x2 + 3x3 + x4 = −2

7x2 − 17x3 + x4 = 12

7x2 − x3 + 11x4 = 0?

The last transformation left an x2 term in the third equation. What elementary transformation was applied to transform the system into

2x1 − x2 + 3x3 + x4 = −2

7x2 − 17x3 + x4 = 12

16x3 + 10x4 = −12?

Is the next system equivalent to the given one? Why? In order to get the system into echelon form, we need a coefficient of 1 for each leading entry. How would we go about doing this?

\[
\begin{aligned}
x_1 - \frac{1}{2}x_2 + \frac{3}{2}x_3 + \frac{1}{2}x_4 &= -1 \\
x_2 - \frac{17}{7}x_3 + \frac{1}{7}x_4 &= \frac{12}{7} \\
x_3 + \frac{5}{8}x_4 &= -\frac{3}{4}
\end{aligned}
\]


How many leading entries does this system have?

This system is now in echelon form, with leading entries provided by the x1 term in the first equation, the x2 term in the second equation, and the x3 term in the last equation. Unlike the prior example, the last unknown will not assume a single value. In particular, the last equation, \( x_3 + \frac{5}{8}x_4 = -\frac{3}{4} \), is a linear equation in two variables. If we isolate the x3 term, we get

\[ x_3 = -\frac{3}{4} - \frac{5}{8}x_4. \]

This means that x4 can assume any value; it is called a free variable. If we let x4 = t, we get

\[ x_3 = -\frac{3}{4} - \frac{5}{8}t. \]

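Substituting t for x4 and \( -\frac{3}{4} - \frac{5}{8}t \) for x3 in the second equation yields

\[
\begin{aligned}
x_2 - \frac{17}{7}\left(-\frac{3}{4} - \frac{5}{8}t\right) + \frac{1}{7}t &= \frac{12}{7} \\
x_2 + \frac{51}{28} + \frac{93}{56}t &= \frac{12}{7} \\
x_2 &= -\frac{3}{28} - \frac{93}{56}t.
\end{aligned}
\]

If we substitute these expressions into equation 1, we find that

\[
\begin{aligned}
x_1 - \frac{1}{2}\left(-\frac{3}{28} - \frac{93}{56}t\right) + \frac{3}{2}\left(-\frac{3}{4} - \frac{5}{8}t\right) + \frac{1}{2}t &= -1 \\
x_1 - \frac{15}{14} + \frac{11}{28}t &= -1 \\
x_1 &= \frac{1}{14} - \frac{11}{28}t.
\end{aligned}
\]

If we now write the solution in parametric form, we get

\[
\begin{aligned}
x_1 &= \frac{1}{14} - \frac{11}{28}t \\
x_2 &= -\frac{3}{28} - \frac{93}{56}t \\
x_3 &= -\frac{3}{4} - \frac{5}{8}t \\
x_4 &= t,
\end{aligned}
\]

where t is any real number. In vector form, the solution set can be written as

\[ S = \left\{ \left\langle \frac{1}{14} - \frac{11}{28}t,\; -\frac{3}{28} - \frac{93}{56}t,\; -\frac{3}{4} - \frac{5}{8}t,\; t \right\rangle : t \in \mathbb{R} \right\}. \]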

Is S a vector space? The algebraic structure on vector spaces can be used to express the solution set in vector form in different suitable ways. Can you think of one such expression? How would you write the solution in sequence form? How many solutions does this system have? We can find any specific solution by selecting a value for t. If we substitute the representations given for x1, x2, x3, and x4 into each equation in the original system, all three equations will be true, thereby proving that the proposed solution set, no matter its specific form, is indeed the solution set of the original system of equations. Is this always true?
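The substitution check described here can be carried out with exact rational arithmetic. The Python sketch below repeats the back-substitution through the echelon system and then substitutes the result into the three original equations. An expression of the form "constant + coefficient · t" is stored as a pair (constant, coefficient); this representation and the helper names are assumptions made for the illustration.

```python
from fractions import Fraction as F

# A linear expression const + coef*t is stored as the pair (const, coef).
def add(u, v):
    return (u[0] + v[0], u[1] + v[1])

def scale(k, u):
    return (k * u[0], k * u[1])

# Back-substitution through the echelon system, bottom equation first.
x4 = (F(0), F(1))                                   # free variable: x4 = t
x3 = add((F(-3, 4), F(0)), scale(F(-5, 8), x4))     # x3 = -3/4 - (5/8) x4
x2 = add((F(12, 7), F(0)),                          # x2 = 12/7 + (17/7) x3 - (1/7) x4
         add(scale(F(17, 7), x3), scale(F(-1, 7), x4)))
x1 = add((F(-1), F(0)),                             # x1 = -1 + (1/2) x2 - (3/2) x3 - (1/2) x4
         add(scale(F(1, 2), x2),
             add(scale(F(-3, 2), x3), scale(F(-1, 2), x4))))

solution = [x1, x2, x3, x4]

# Original system: coefficient rows and right-hand sides.
rows = [([2, -1, 3, 1], -2),
        ([3, 2, -4, 2], 3),
        ([-1, 4, -2, 5], 1)]

def substitute(coeffs, sol):
    """Left side of an equation after substituting the parametric solution."""
    total = (F(0), F(0))
    for a, x in zip(coeffs, sol):
        total = add(total, scale(F(a), x))
    return total

# Each equation holds for every t when the result is (right-hand side, 0).
checks = [substitute(coeffs, solution) == (F(c), F(0)) for coeffs, c in rows]
print(solution)
print(checks)
```

A check that returns true for all three equations shows that every member of the parametric family is a solution; it does not by itself show that there are no other solutions.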

In Activity 12 you transformed the given system using elementary transformations. How many leading entries did the system in echelon form have? What is the solution of that system? Is it possible for a system to have no solution?

Let’s consider another example in R:

3x1 + 6x2 − 3x3 = 6

−2x1 − 4x2 − 3x3 = −1

3x1 + 6x2 − 3x3 = 10

What are the elementary operations used to transform the system into the following equivalent systems?

3x1 + 6x2 − 3x3 = 6

−15x3 = 9

3x1 + 6x2 − 3x3 = 10

and

3x1 + 6x2 − 3x3 = 6

−15x3 = 9

0 = 4

The last equation corresponds to

0x1 + 0x2 + 0x3 = 4.

What is the meaning of the last equation of the transformed system? Does it have a solution? What does it mean in terms of the sequence of values [x1, x2, x3]? No sequence can satisfy this equation, so the system has no solution.

The examples discussed here represent each of the possible types of solution sets of a system of equations. A system of equations in K = Zp has a finite number of solutions, whereas a system of equations over an infinite field has a unique solution, infinitely many solutions, or no solution. We have illustrated this result with examples in this section; it will be proved in Chapter 6. Although we have so far assumed that the coefficients in the leading positions are different from zero, in many systems they are zero. In such cases it is more convenient to interchange the appropriate equations first, so that the leading entries of the top equations are different from zero, and then transform the system into echelon form.

Summarizing the Process for Finding the Solution of a System of Equations

If we are given a system of equations in K when K is finite, we can find the solution set by substituting each possible sequence of values for the unknowns into each equation. The sequences for which the func returns true for every equation in the system are exactly the elements of the solution set.

If K is an infinite set, such as R, then it is impossible to check each sequence of possible solutions. In this case, we must transform the original system of equations into a simpler system whose solution set is equal to that of the original system. There are three elementary transformations that can be applied to a system without changing its solution set:

• Interchange the positions of two equations.

• Multiply both sides of an equation by a nonzero constant.

• Add a multiple of one equation to another equation.

The goal of applying elementary transformations is to produce a system of equations in echelon form, that is, a simpler system whose solution set is easy to identify or to construct. To place a system in echelon form, we apply the following series of steps to a system such as

a11x1 + a12x2 + a13x3 + a14x4+ · · ·+ a1qxq = c1

a21x1 + a22x2 + a23x3 + a24x4+ · · ·+ a2qxq = c2

a31x1 + a32x2 + a33x3 + a34x4+ · · ·+ a3qxq = c3

...

ar1x1 + ar2x2 + ar3x3 + ar4x4+ · · ·+ arqxq = cr

1. Scale each leading coefficient to one by dividing the equation by the coefficient of its leading variable.

2. Apply elementary transformations that eliminate x1 from equations 2 and higher and replace those equations by the transformed ones.

3. Do the same to eliminate the x2 term from equations 3 and higher.

4. Do the same to eliminate the x3 term from equations 4 and higher.

5. Continue this process until the leading entries form a triangular pattern.

Once completed, the echelon system should look something like

x1 + b12x2 + b13x3 + b14x4+ · · ·+ b1qxq = d1

x2 + b23x3 + b24x4+ · · ·+ b2qxq = d2

x3 + b34x4+ · · ·+ b3qxq = d3

...

xr + br(r+1)xr+1+ · · ·+ brqxq = dr

In general, the leading entry in any given equation should occur in a column to the right of the leading entry in the prior equation. Based upon the final echelon form, the solution set of the system can be found. If the system is in K = R, what would you expect the final echelon form of a system that has an infinite number of solutions to be? Does such a system have any free variables?
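The steps summarized above can be sketched as a short routine. The version below is one possible arrangement in Python, using exact fractions so that it applies to systems with rational coefficients; it interchanges equations when a leading coefficient is zero, scales the leading coefficient to one, and then eliminates that unknown from the equations below. The name `echelon` and the row format [a1, ..., aq, c] are assumptions for this illustration.

```python
from fractions import Fraction

def echelon(rows):
    """Return an echelon form of the rows [a1, ..., aq, c] over the rationals."""
    rows = [[Fraction(a) for a in r] for r in rows]
    n_rows, n_cols = len(rows), len(rows[0]) - 1
    pivot_row = 0
    for col in range(n_cols):
        # Find an equation at or below pivot_row with a nonzero entry in this column.
        swap = next((r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None)
        if swap is None:
            continue                                                # no leading entry here
        rows[pivot_row], rows[swap] = rows[swap], rows[pivot_row]   # interchange
        lead = rows[pivot_row][col]
        rows[pivot_row] = [a / lead for a in rows[pivot_row]]       # scale to 1
        for r in range(pivot_row + 1, n_rows):                      # eliminate below
            k = rows[r][col]
            rows[r] = [a - k * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return rows

# The system of three equations in four unknowns discussed earlier in this section.
system = [[2, -1, 3, 1, -2],
          [3, 2, -4, 2, 3],
          [-1, 4, -2, 5, 1]]
for row in echelon(system):
    print([str(a) for a in row])
```

A row whose coefficients are all zero but whose constant is nonzero signals a system with no solution; a column without a leading entry corresponds to a free variable.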

Exercises

The following exercises involve systems of equations where the variables are all in K = R unless otherwise stated.

1. Given the following equations, verify whether they are true for the values x = −2, x = 5, x = 0, and x = 1.

(a) x^2 − 3x − 4 = 6

(b) x + 7 = 5

(c) x^2 + 2x + 1 = (x + 1)^2

2. Give two examples of nonlinear equations.

3. Are the sequences [−1, 1, 0], [1, 1, 2], [0, 0, 1], and [1, −1, 0] solutions of the equation

2x+ y + 2z = 1

for x, y, and z in Z3?

4. Find the solution set of the system

x1 − x2 + 3x3 = 3

2x1 − x2 + 2x3 = 2

3x1 + x2 − 2x3 = 2

by transforming the system into an equivalent system that is in echelon form.

5. Using elementary transformations, find the solution set of the system

3x1 + 6x2 − 3x4 = 3

x1 + 3x2 − x3 − 4x4 = −12

x1 − x2 + x3 + 2x4 = 8

2x1 + 3x2 = 8

by transforming the original system into echelon form.

(a) What are the leading entries? Are there any free variables?

(b) Does the system have one, infinitely many, or no solution? What is the relationship between the existence of free variables and whether the system has one, infinitely many, or no solution?

(c) Express the solution set in vector and sequence form. Explain the difference between the two ways of expressing the solution set.


(d) Substitute the solution back into each equation of the originalsystem. After substitution, is each equation true?

6. Find the solution set of the system

x1 + 2x2 − x3 + 3x4 + x5 = 2

2x1 + 4x2 − 2x3 + 6x4 + 3x5 = 6

−x1 − 2x2 + x3 − x4 + 3x5 = 4

by reducing it to an equivalent system that is in echelon form. What are the leading entries? Continue performing elementary transformations on the system to eliminate the variable x2 from the first and third equations and the variable x3 from the first and second equations. What do you observe? Can you continue performing elementary transformations on the system without altering the leading entries?

The system you found is said to be in reduced echelon form. In a reduced echelon form, we go beyond echelon form to get zeros in all of the coefficients above and below each leading entry. The elementary transformations and the basic process are the same. The result is a system that is even more simplified than echelon form.

(a) You have already identified the leading entries. Are there any free variables?

(b) Does this system have a unique, infinitely many, or no solution?

(c) Express the solution set in vector, and sequence form.

(d) Using the general form given in either the vector or sequence form of the solution set, create three different specific solutions, and substitute your results into the equations of the original system. What do you observe?

(e) Substitute the general form of the solution back into each equation of the original system. After substitution, is each equation true?

(f) Is the solution set of the system a vector space?

(g) Write a new system which has the same expressions on the left side of the equations but that has zeros as the constant terms to the right of the equal sign. Find the solution set of the new system. Is the solution set of this system a vector space?



2x1 − 4x2 + 12x3 − 10x4 = 58

−x1 + 2x2 − 3x3 + 2x4 = −14

2x1 − 4x2 + 9x3 − 6x4 = 44

by transforming the original system into reduced echelon form.

(a) What are the leading entries? Are there any free variables?

(b) Does the system have one, infinitely many, or no solution?

(c) Express the solution set in vector and in sequence form.

(d) Using the general form given in either the vector or sequence form of the solution set, create three different specific solutions, and substitute your results into the equations of the original system. What do you observe?


(f) Is the solution set of this system a vector space?

8. Write a system in reduced echelon form such that its solution is given by:

x1 = −3t+ 4

x2 = −2t+ 1.

The next series of steps is designed to transform the system into one possible original system. Carefully perform each step.

(a) Take −1 times equation 2, and add the result to equation 1 to yield a “new” equation 1. Write down the new system that results from performing this transformation.

(b) Create an equation 3 by taking −2 times equation 2. (In this case, we think of equation 3 as 0x1 + 0x2 + 0x3 = 0. Hence, what we are really doing is taking −2 times equation 2, and adding the result to equation 3 to yield a “new” equation 3.) Write down the new system that results from performing this transformation.

(c) Take 2 times equation 1, and add the result to equation 2 to yield a “new” equation 2. Write down the new system that results from performing this transformation.

(d) Take 3 times equation 1, and add the result to equation 3 to yield a “new” equation 3. Write down the new system that results from performing this transformation.

(e) Multiply both sides of equation 1 by 3 to yield a “new” equation 1. Write down the new system that results from performing this transformation.

(f) Using the general solution, construct three different specific solutions, and substitute each of these into the “original” system you have just created. What do you observe?

(g) Substitute the general form of the solution set given above into each equation of the resulting “original” system you have created. Is each equation true?

9. Suppose a system of 2 equations in 3 unknowns has a solution set whose vector form is given by

S = {〈−3t + 1, 4t + 2, t〉 : t ∈ R}.

Write the reduced echelon form that corresponds to this system. Then, apply three elementary transformations of your choice. Show that the general form of the solution is a solution of the resulting “original” system you have created. Using three different elementary transformations, create a second “original” system, and show that the general form of the solution is a solution to the second system you have created.


x1 − x2 + 2x3 = 3

2x1 − 2x2 + 5x3 = 4

x1 + 2x2 − x3 = −3

2x2 + 2x3 = 1


(a) What are the leading entries? How many of them are there? Are there any free variables?




(d) Using the general form of the solution set, create three different specific solutions and substitute your results into the equations of the original system. What do you observe?



2x1 − 4x2 + 16x3 − 14x4 = 10

−x1 + 5x2 − 17x3 + 19x4 = −2

x1 − 3x2 + 11x3 − 11x4 = 4

3x1 − 4x2 + 18x3 − 13x4 = 17


(a) What are the leading entries? How many of them are there? Are there any free variables?



(d) Using the general form of the solution set, create three different specific solutions and substitute your results into the equations of the original system. What do you observe?


12. Consider the following homogeneous system of four equations in four unknowns given by

x1 − 2x2 + x3 − 4x4 = 0

2x1 − 3x2 + 2x3 − 3x4 = 0

3x1 − 5x2 + 3x3 − 4x4 = 0

−x1 + x2 − 18x3 + 2x4 = 0.


(a) Using elementary transformations, find the solution set of the system.

(b) What are the leading entries? How many of them are there? Are there any free variables?

(c) Does the system have one, infinitely many, or no solution?

(d) Express the solution set in vector and in sequence form. Is this the only solution? Why?

(e) Using the general form of the solution set, create three different specific solutions and substitute your results into the equations of the original system. What do you observe?

(f) Is the solution set of this system a vector space?

13. Consider the following system of four equations in four unknowns given by

x1 − 2x2 + x3 − 4x4 = 4

2x1 − 3x2 + 2x3 − 3x4 = −1

3x1 − 5x2 + 3x3 − 4x4 = 3

−x1 + x2 − 18x3 + 2x4 = 5.

(a) Compare this system to the one given in the previous exercise. What are the similarities? What are the differences?

(b) Is

S = {[−6, −6, −1, 3]}

a solution to the system? Why? Is this the only solution to the system? Why?

(c) Using elementary transformations, find the solution set of the system. What do you observe?

(d) The system of the previous exercise can be considered the homogeneous system associated with this system. Why? Take the general form of the solution set of the homogeneous system and add this solution to the solution given by the sequence

S = {[−6, −6, −1, 3]}.

Is the sum a solution to the system? Why?


(e) Using elementary transformations, find the solution set of the system.

(f) What are the leading entries? How many of them are there? Are there any free variables?

(g) Does the system have one, infinitely many, or no solution?

(h) Express the solution set in vector and in sequence form.

(i) Using the general form of the solution set, create three different specific solutions and substitute your results into the equations of the original system. What do you observe?

14. The reduced echelon forms of two equations in two unknowns can be classified in one of three different ways:

Unique solution:

x1 + 0x2 = c1

0x1 + x2 = c2

No solution:

x1 + bx2 = c1

0x1 + 0x2 = c2 (c2 ≠ 0)

Infinitely many solutions:

x1 + bx2 = c

0x1 + 0x2 = 0

Classify, in a similar manner, the possible reduced echelon forms of a system of three equations in three unknowns. Indicate which system(s) yield a unique solution, infinitely many solutions, or no solution.

15. Consider the general system of two equations in two unknowns given by

ax+ by = e

cx+ dy = f.

(a) Determine conditions on a, b, c, d, e, and f that result in this system having a unique solution.

(b) Determine conditions on a, b, c, d, e, and f that result in this system having no solution.

(c) Determine conditions on a, b, c, d, e, and f that result in this system having infinitely many solutions.

16. A homogeneous system of equations is any system of equations in which all of the constant terms are zero. Show that x = 0, y = 0 is a solution to the system

ax+ by = 0

cx+ dy = 0.

Prove that this is the only solution if and only if ad − bc ≠ 0.

17. For a homogeneous system of n linear equations in n unknowns, prove that:

(a) The sum of two solutions to the system is a solution of the system.

(b) A multiple of a solution to the system is a solution of the system.

18. For a non-homogeneous system of n linear equations in n unknowns, prove that if a vector p is a particular solution of the system, and if h is a solution of the associated homogeneous system (that is, of the homogeneous system that has the same coefficients as the given system), then p + h is a solution of the non-homogeneous system.

19. Consider the following system of two differential equations in two unknowns given by

x− y = x′

x+ 3y = y′.

Are the functions x, y given by

x(t) = e^{2t}

y(t) = −e^{2t}

elements of the vector space C∞(R)? Is the pair [x, y] a solution to the system? Why? Explain in your own words what the solution to a system of differential equations is.


20. Prove or disprove the following statements:

(a) The solution set of a system of equations in K = R is a vector space.

(b) The solution set of a homogeneous system of equations in K = R is a vector space.


3.2 Solving Systems Using Augmented Matrices

Activities

1. Go back to Activity 11 in the prior section (see page 85). The operations given in (b), (c), and (d) transformed the original system of equations into a simpler system. Go back and review the steps of that process. What changed in each transformation? What remained the same?

2. Go back to Activity 12 in the prior section (see page 86). Transform the original system into echelon form by hand. After each step, write the newly formed system on the left side of the page. On the right side of the page, write only the numbers corresponding to the coefficients and the constant term of each equation. What do you observe? Can you read the solution of the system from the array? What do the elements on the right side of the page represent? Would it be possible to design a procedure that uses only the coefficients and the constants arranged in an array when applying elementary transformations?

3. Let K = Z5. Consider the system given below:

3x1 + x2 + 4x3 = 1

x1 + 3x2 + 3x3 = 4

4x1 + x2 + 3x3 = 3

(a) Take the coefficients of each equation and form three sequences.

(b) Use the program matrix_from_tuple from the Matrix package in ISETL to define an array of the coefficients of the equations of the system, or matrix, where the first row of the matrix, [a11, a12, a13], is the sequence of the coefficients from the first equation, that is,

[a11, a12, a13] = [3, 1, 4];

the second row of the matrix, [a21, a22, a23], is the sequence of the coefficients from the second equation, that is,

[a21, a22, a23] = [1, 3, 3];

and the third row of the matrix, [a31, a32, a33], is the sequence of the coefficients from the third equation, that is,

[a31, a32, a33] = [4, 1, 3].

Then use the program display_matrix(M) from the same package to display the array. This array is called the coefficient matrix of the system.

(c) Take the constants of each equation and form a sequence. Use the program augment_col(M,c) from the Matrix package to define a new array which includes the coefficients and constants from the system defined at the beginning of this activity. We call this array the augmented matrix of the system.

(d) Use the programs matrix_row, matrix_add, scale_matrix and set_matrix_row from the Matrix package to create a program reduce_matrix that will take the augmented matrix of a system of equations and return its echelon form, as well as display all intermediate steps, that is, the augmented matrix that results after each single elementary transformation has been applied.

(e) Write down the resulting system of equations. To do this, convert each row of the matrix into an equation, with the first column entry set equal to the coefficient of x1, the second column entry set equal to the coefficient of x2, the third column entry set equal to the coefficient of x3, and the column entry after the vertical line set equal to the constant on the opposite side of the equal sign. What is the solution of the system?

(f) Apply the func Three_eqn to the system, and compare the result with the one you obtained in part (c). What do you observe?

4. Let K = Z7. Given the system of equations,

4x1 + x2 + 3x3 + 2x4 = 1

x1 + 3x2 + 2x3 + 5x4 = 2

3x1 + 4x2 + 2x3 + x4 = 6,

follow a process similar to that described in the last activity to find the solution set of the given system. Modify the program you wrote in the last activity, so that the final augmented matrix is in reduced echelon form. What is the solution of the system? Apply the func Three_eqn both to the original and the simplified systems to verify your results.

5. Let K = R. Given the following array,

\[
\begin{pmatrix}
4 & 7 & -1 & 0 \\
8 & 8 & 2 & -4 \\
3 & 4 & 1 & -3
\end{pmatrix},
\]

write the corresponding system of equations. Solve the system. Does the system have one solution, multiple solutions, or no solution?

6. Let K = R. Consider the following system of equations

2x1 + 2x2 + 5x3 + x4 + 3x5 = 0

x1 + 2x2 + 4x3 + 6x4 + x5 = 0

3x1 + 4x2 + x3 + x4 + 4x5 = 0

x1 + 3x2 + 3x3 + 2x4 + 5x5 = 0.

(a) Apply the reduce_matrix program to find the solution set of this system.

(b) How many free variables do you find? How many solutions does this system have?

(c) Is the solution set of this system a vector space? Use one of the tools you designed in Chapter 2.

(d) Compare the coefficients of the last system with those of the following one:

2x1 + 2x2 + 5x3 + x4 + 3x5 = 3

x1 + 2x2 + 4x3 + 6x4 + x5 = 2

3x1 + 4x2 + x3 + x4 + 4x5 = 5

x1 + 3x2 + 3x3 + 2x4 + 5x5 = 2.

What do you observe?

(e) Is the sequence

[160/61, −114/61, −30/61, 40/61]

a solution to this system?

(f) Find the solution set of the system given in part (d). Compare the solution of the system given in part (d) with the form of the solution set of the homogeneous system given at the beginning of this activity. What do you observe? Describe the solution to part (d) in terms of the solution of the associated homogeneous system.

(g) Is the solution set of the system in part (d) a vector space?

7. Suppose a system of equations has the following matrix of coefficients,

\[
\begin{pmatrix}
2 & -1 & 3 \\
1 & -3 & 1 \\
4 & 0 & -3
\end{pmatrix}.
\]

Solve the system of equations associated with this matrix, if the constants of the system are given by the following sequences:

(a) [1, 0, 0]

(b) [0, 1, 0]

(c) [0, 0, 1]

What do you observe? Can you design a way to solve the three systems at the same time?

8. Let K = Z7. Apply Three_eqn to verify that the two systems of equations,

x1 + x2 + x3 = 6

4x1 + 3x2 + 5x3 = 1

3x1 + 2x2 + x3 = 6

and

5x1 + 2x2 + 4x3 = 6

x1 + 3x2 + 3x3 = 0

3x1 + 2x2 + 4x3 = 2,

have the same solution set. Find the echelon form of each system. What do you observe?

9. Let K = Z7. Apply Three_eqn to verify that the two systems of equations,

x1 + x2 + x3 = 6

4x1 + 3x2 + 5x3 = 1

3x1 + 2x2 + x3 = 6

and

x1 + 2x2 + 3x3 = 1

2x1 + 3x2 + 5x3 = 4

3x1 + 4x2 + 4x3 = 6,

have different solution sets. Find the echelon form of each system. What do you observe?

10. Given a field K = Zp, where p is a prime number, or K = R, and given a general system of linear equations, such as

a11x1 + a12x2+ · · ·+ a1qxq = c1

a21x1 + a22x2+ · · ·+ a2qxq = c2

a31x1 + a32x2+ · · ·+ a3qxq = c3

...

an1x1 + an2x2+ · · ·+ anqxq = cn,

where aij ∈ K for i = 1, . . . , n and j = 1, . . . , q, and ci ∈ K, write the steps for finding the solution set of a system using matrices as if you were being asked to explain this process to a classmate having difficulty with linear algebra.

Discussion

Using Augmented Matrices

In the last section, we discussed various strategies for finding the solution set of a system of linear equations. For a given system in Zp, p prime, you constructed funcs that accepted a sequence of values and then tested whether the given sequence was a solution. For the same system, you used this func to identify all of the sequences that make up the solution set. You were then asked to generalize this process by writing funcs for a system of two equations and a system of three equations. What were the inputs and the outputs for each of the funcs Two_eqn and Three_eqn that you wrote? How would you generalize this process for a system of n equations? These funcs are limited, however, because they can only be applied to systems of equations defined over finite fields. Although such funcs would work in theory when K = R, this is not practical, because R has infinitely many elements. To compensate for this, you learned how to apply elementary transformations to a system of linear equations over R to transform it into a simpler system whose solution set was both easy to identify and equivalent to the solution set of the original system. However, the process of transforming the system through the use of elementary transformations can often be cumbersome, despite its effectiveness. The purpose of this section is to introduce a means of streamlining the process.
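The brute-force idea behind funcs such as Two_eqn and Three_eqn can be sketched in a few lines. What follows is a minimal Python sketch, not the ISETL code you wrote; the name solution_set and the sample system over Z3 are hypothetical and chosen only for illustration. It tests every sequence in (Zp)^n against every equation, which is exactly why the approach cannot be carried over to K = R.

```python
from itertools import product

def solution_set(coeffs, consts, p):
    """Return every sequence in (Z_p)^n that satisfies all equations.

    coeffs: list of rows of coefficients (one row per equation)
    consts: list of constant terms
    p:      a prime, so that Z_p is a field
    (Hypothetical helper, written only to illustrate direct substitution.)
    """
    n = len(coeffs[0])
    solutions = []
    for candidate in product(range(p), repeat=n):
        # A candidate is a solution when every equation is true mod p.
        if all(sum(a * x for a, x in zip(row, candidate)) % p == c % p
               for row, c in zip(coeffs, consts)):
            solutions.append(list(candidate))
    return solutions

# Illustrative system over Z_3 (not taken from the text):
#   x +  y = 1
#   x + 2y = 0
print(solution_set([[1, 1], [1, 2]], [1, 0], 3))  # [[2, 2]]
```

The loop visits p^n candidates, so the cost grows quickly even over a finite field; this is one more reason to look for a method based on transforming the system.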

In Activity 1, you analyzed the process of transforming the system to identify those features that remained invariant and those that changed. Once you identified these features, in Activity 2 you were asked to repeat the procedure of transformation of a system of equations in a different way. Which procedure seems to make the process of transformation easier? In Activity 3, you learned how to write a system of linear equations as an array of numbers, the augmented matrix associated with the given system. The augmented matrix is formed by augmenting the coefficient matrix (formed by placing the coefficients of the unknowns in an array) with a column consisting of the constant terms of the equations.

In an augmented matrix, such as

\[
\left(\begin{array}{ccc|c}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{array}\right),
\]

the columns to the left of the vertical line correspond to the coefficients of the equations, and the column to the right of the vertical line corresponds to the constants for each equation. (If it is clear that the matrix represents the coefficients and constants of an associated system of equations, the vertical line is often dropped.) The first row of terms, a11, a12, a13, and a14, consists of the coefficients and the constant of the first equation. The entries a11, a12, a13 are the coefficients of the unknowns, say x1, x2, and x3, and a14 is the constant which appears on the opposite side of the equals sign; in particular, row 1 corresponds to the equation

a11x1 + a12x2 + a13x3 = a14.

Each double digit subscript denotes the row and column position of each entry. For example, the 12 that appears in a12 tells us that the entry corresponding to a12 resides in the second column of the first row; a43 denotes the entry that occupies the third column of the fourth row. In general, aij denotes the entry in the jth column of the ith row. What is the form of the other equations associated with the augmented matrix shown above?
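The correspondence between rows of an augmented matrix and equations can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: the helper names augment, entry, and row_as_equation are hypothetical (they are not the Matrix package programs), and the sample numbers are made up.

```python
def augment(coefficient_rows, constants):
    """Append each constant to its row of coefficients."""
    return [list(row) + [c] for row, c in zip(coefficient_rows, constants)]

def entry(M, i, j):
    """Return a_ij, the entry in row i and column j, counting from 1."""
    return M[i - 1][j - 1]

def row_as_equation(M, i):
    """Write row i of an augmented matrix as an equation in x1, x2, ..."""
    row = M[i - 1]
    terms = [f"{a}*x{j}" for j, a in enumerate(row[:-1], start=1)]
    return " + ".join(terms) + f" = {row[-1]}"

# Hypothetical system: x1 + 2x2 - x3 = 4 and x2 + 3x3 = 5.
M = augment([[1, 2, -1], [0, 1, 3]], [4, 5])
print(entry(M, 1, 2))         # 2, second column of the first row
print(row_as_equation(M, 2))  # 0*x1 + 1*x2 + 3*x3 = 5
```

Note that Python lists count positions from 0, so the helpers subtract 1 to match the subscripts used in the text.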

In the first activities, you discovered that elementary transformations change the coefficients and the constants; the unknowns remain unchanged. In Activity 2, you were asked to think about designing a procedure for simplifying a system that would involve only the coefficients and the constants of the system. In Activity 3, you were actually introduced to such a methodology: you formed the augmented matrix corresponding to the given system; you applied elementary row operations to each row of the matrix to convert it into echelon form; and you used the subsequent echelon form to identify the solution set of the original system. You designed the program reduce_matrix to perform these steps after having accepted the coefficients and constant of each equation.

Let’s apply this procedure to a specific example in K = R, such as

x1 + 2x2 − x3 + 3x4 + x5 = 2

2x1 + 4x2 − 2x3 + 6x4 + 3x5 = 6

−x1 − 2x2 + x3 − x4 + 3x5 = 4.

First, rewrite the system as its associated augmented matrix:

\[
\begin{pmatrix}
1 & 2 & -1 & 3 & 1 & 2 \\
2 & 4 & -2 & 6 & 3 & 6 \\
-1 & -2 & 1 & -1 & 3 & 4
\end{pmatrix}.
\]

Second, our goal is to transform this matrix into either its echelon or its reduced echelon form. A matrix is in echelon form if:


1. Any rows consisting entirely of zeros are grouped at the bottom of the matrix.

2. The first nonzero element of each row is 1. This element is called a leading entry.

3. The leading entry of each row is positioned to the right of the leading entry in the prior row.

4. All entries in the column below a leading entry are zero.

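The process of transforming a matrix into echelon form is known as Gaussian elimination. The nonzero entries of an echelon matrix create a diagonal configuration. Three examples of echelon matrices are given below. What features do they have that define this form?

\[
\begin{pmatrix}
1 & 2 & -1 & 0 & 2 & 1 \\
0 & 0 & 1 & -3 & 1 & -2 \\
0 & 0 & 0 & 0 & 1 & 5
\end{pmatrix}
\qquad
\begin{pmatrix}
0 & 1 & -1 & 2 & 3 \\
0 & 0 & 1 & -1 & -4 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 0 & 2 & 1 \\
0 & 1 & -3 & 4 \\
0 & 0 & 1 & -1 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]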

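The three matrices given below are not in echelon form. Can you explain why?

\[
\begin{pmatrix}
1 & -3 & 2 & 4 \\
0 & 0 & -1 & 2 \\
0 & 1 & -3 & 4
\end{pmatrix}
\qquad
\begin{pmatrix}
2 & 1 & -3 & 4 \\
0 & 1 & -1 & 2 \\
0 & 1 & 2 & -3
\end{pmatrix}
\qquad
\begin{pmatrix}
0 & 1 & 6 & 2 & -3 \\
1 & -2 & 3 & -1 & 1 \\
0 & 0 & 0 & 1 & -2 \\
0 & 0 & 0 & -3 & 5
\end{pmatrix}
\]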
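The four conditions that define echelon form can be turned into a mechanical test. The following Python sketch is one possible way to do so; the function names leading_index and is_echelon are hypothetical, and the code simply checks the conditions in the order in which they are listed above.

```python
def leading_index(row):
    """Column index (from 0) of the first nonzero entry, or None for a zero row."""
    for j, a in enumerate(row):
        if a != 0:
            return j
    return None

def is_echelon(M):
    """Check the four echelon-form conditions stated in the text."""
    previous_lead = -1
    seen_zero_row = False
    for i, row in enumerate(M):
        j = leading_index(row)
        if j is None:
            seen_zero_row = True
            continue
        if seen_zero_row:          # condition 1: zero rows must be at the bottom
            return False
        if row[j] != 1:            # condition 2: leading entry must be 1
            return False
        if j <= previous_lead:     # condition 3: leading entries move to the right
            return False
        if any(lower[j] != 0 for lower in M[i + 1:]):  # condition 4: zeros below
            return False
        previous_lead = j
    return True

# One echelon example and one non-example from the discussion above.
print(is_echelon([[0, 1, -1, 2, 3], [0, 0, 1, -1, -4], [0, 0, 0, 0, 1]]))  # True
print(is_echelon([[2, 1, -3, 4], [0, 1, -1, 2], [0, 1, 2, -3]]))           # False
```

When the test returns False, it can be instructive to work out by hand which of the four conditions failed first.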


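Reduced echelon form is basically the same as echelon form, except that all of the column entries above, as well as below, a given leading entry must be zero. Three examples of reduced echelon matrices are given below.

\[
\begin{pmatrix}
1 & 5 & 0 & 2 & 0 \\
0 & 0 & 1 & 9 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 0 & 4 & 0 & 0 \\
0 & 1 & 2 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & -2 & 0 & 3 & 0 & 4 \\
0 & 0 & 1 & 2 & 0 & 9 \\
0 & 0 & 0 & 0 & 1 & 8
\end{pmatrix}
\]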

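The three matrices given below are not in reduced echelon form. Can you explain why?

\[
\begin{pmatrix}
1 & 0 & 0 & 5 & 3 \\
0 & 0 & 1 & 0 & 3 \\
0 & 1 & 2 & 3 & 7
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 0 & 4 & 2 & 6 \\
0 & 1 & 3 & 2 & -3 \\
0 & 0 & 0 & 1 & -2 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\qquad
\begin{pmatrix}
1 & 0 & 2 & 0 & 3 \\
0 & 0 & 0 & 0 & 0 \\
0 & 1 & 2 & 0 & 7 \\
0 & 0 & 0 & 1 & 3
\end{pmatrix}
\]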
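A test for reduced echelon form differs from a test for echelon form in only one place: the column of each leading entry must be zero above as well as below. The Python sketch below illustrates this; the name is_reduced_echelon is hypothetical.

```python
def is_reduced_echelon(M):
    """Check echelon form plus zeros above and below every leading entry."""
    previous_lead = -1
    seen_zero_row = False
    for i, row in enumerate(M):
        nonzero_columns = [j for j, a in enumerate(row) if a != 0]
        if not nonzero_columns:
            seen_zero_row = True
            continue
        j = nonzero_columns[0]     # column of the leading entry
        # Zero rows at the bottom, leading entry 1, leading entries move right.
        if seen_zero_row or row[j] != 1 or j <= previous_lead:
            return False
        # Every other entry in the leading entry's column must be zero.
        if any(other[j] != 0 for k, other in enumerate(M) if k != i):
            return False
        previous_lead = j
    return True

# One reduced echelon example and one non-example from the discussion above.
print(is_reduced_echelon([[1, -2, 0, 3, 0, 4], [0, 0, 1, 2, 0, 9], [0, 0, 0, 0, 1, 8]]))  # True
print(is_reduced_echelon([[1, 0, 4, 2, 6], [0, 1, 3, 2, -3], [0, 0, 0, 1, -2], [0, 0, 0, 0, 1]]))  # False
```

The second matrix is in echelon form but fails the extra requirement, because the column of the leading entry in row 3 contains nonzero entries above it.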

In the prior section, you applied elementary transformations to transform a system into echelon form. In trying to simplify an augmented matrix, what transformations did you use? You used the analogous elementary row operations to find the solutions of the systems in Activities 4 and 5. We will apply these operations to simplify the following matrix. Can you identify the operation done at each step of the transformation?

\[
\begin{pmatrix}
1 & 2 & -1 & 3 & 1 & 2 \\
2 & 4 & -2 & 6 & 3 & 6 \\
-1 & -2 & 1 & -1 & 3 & 4
\end{pmatrix}.
\]

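Step 1 Which elementary row operations were used to eliminate the column 1 entries in rows 2 and 3? These operations yield

\[
\begin{pmatrix}
1 & 2 & -1 & 3 & 1 & 2 \\
0 & 0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 2 & 4 & 6
\end{pmatrix}.
\]

Step 2 What elementary row operation was used to yield

\[
\begin{pmatrix}
1 & 2 & -1 & 3 & 1 & 2 \\
0 & 0 & 0 & 2 & 4 & 6 \\
0 & 0 & 0 & 0 & 1 & 2
\end{pmatrix}?
\]

Step 3 What elementary row operation was used to yield

\[
\begin{pmatrix}
1 & 2 & -1 & 3 & 1 & 2 \\
0 & 0 & 0 & 1 & 2 & 3 \\
0 & 0 & 0 & 0 & 1 & 2
\end{pmatrix}?
\]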

Is the last matrix in echelon form? If so, and if we wish to transform it into reduced echelon form, we must continue the process using the elementary row operations to eliminate the column entry above the leading entry in row 2, and to eliminate the nonzero entries in column 5 above the leading entry in row 3. What operations have been performed in the next steps?

Step 4 They would give us

\[
\begin{pmatrix}
1 & 2 & -1 & 0 & -5 & -7 \\
0 & 0 & 0 & 1 & 2 & 3 \\
0 & 0 & 0 & 0 & 1 & 2
\end{pmatrix}.
\]

Step 5 How can we obtain the following matrix, which is in reduced echelon form?

\[
\begin{pmatrix}
1 & 2 & -1 & 0 & 0 & 3 \\
0 & 0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 2
\end{pmatrix}
\]


Is the last matrix in reduced echelon form? Why? Once the matrix is in reduced echelon form we can identify the solution set. What are the equations corresponding to this matrix? As you found out, the unknowns x4 and x5 are fixed; x2 and x3 are free variables; and x1 is dependent upon x2 and x3.
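If you would like to check your answers to the questions posed in Steps 1 through 5, the sketch below replays one sequence of elementary row operations that is consistent with the matrices shown. It is written in Python rather than ISETL, the helper names swap, scale, and add_multiple are hypothetical, and rows are counted from 0 as Python lists require. Other sequences of operations may lead to the same matrices.

```python
from fractions import Fraction

def swap(M, i, k):
    """Interchange rows i and k."""
    M[i], M[k] = M[k], M[i]

def scale(M, i, c):
    """Multiply row i by the nonzero constant c."""
    M[i] = [c * a for a in M[i]]

def add_multiple(M, source, c, target):
    """Add c times row `source` to row `target`."""
    M[target] = [b + c * a for a, b in zip(M[source], M[target])]

# Augmented matrix of the example system, with exact rational arithmetic.
M = [[Fraction(a) for a in row] for row in
     [[1, 2, -1, 3, 1, 2],
      [2, 4, -2, 6, 3, 6],
      [-1, -2, 1, -1, 3, 4]]]

add_multiple(M, 0, -2, 1)     # Step 1: clear column 1 in the second row
add_multiple(M, 0, 1, 2)      #         and in the third row
swap(M, 1, 2)                 # Step 2: interchange the last two rows
scale(M, 1, Fraction(1, 2))   # Step 3: make the leading entry equal to 1
add_multiple(M, 1, -3, 0)     # Step 4: clear the entry above the row-2 leading entry
add_multiple(M, 2, 5, 0)      # Step 5: clear column 5 above the row-3 leading entry
add_multiple(M, 2, -2, 1)

for row in M:
    print([str(a) for a in row])
```

Reading off the final rows gives x1 + 2x2 − x3 = 3, x4 = −1, and x5 = 2, which agrees with the description of fixed, free, and dependent unknowns above.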

Echelon and reduced echelon form provide a convenient notation for identifying the solution set of a system of equations. Elementary row operations are the necessary tool that allows us to systematize the simplification process for transforming a system of equations into an augmented matrix in echelon form.

The objective of Activity 6 was to apply this new tool to help you solve a homogeneous system and to relate the solution of a non-homogeneous system to that of its associated homogeneous system. Can the general solution of a non-homogeneous system be written as the sum of a specific solution and the general solution of the homogeneous system? If the solution set of the homogeneous system given at the beginning of the activity is a vector space, what exactly must you show? If the answer is yes, of what vector space is the solution set a subspace? Why isn't the solution set of the non-homogeneous system a subspace? Activity 7 was intended to demonstrate that it is possible to use the augmented matrix to solve several systems at once when the coefficients of each equation remain the same. In trying to solve all three systems simultaneously, what would be the form of the resulting augmented matrix?
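The relationship between a non-homogeneous system and its associated homogeneous system can be checked numerically on the example system x1 + 2x2 − x3 + 3x4 + x5 = 2, 2x1 + 4x2 − 2x3 + 6x4 + 3x5 = 6, −x1 − 2x2 + x3 − x4 + 3x5 = 4 worked out earlier in this discussion. The Python sketch below is only a spot check for a few parameter values, not a proof; the names satisfies, p, and h are hypothetical, and the particular solution p and the homogeneous solutions h(s, t) were read off from the reduced echelon form of that example.

```python
def satisfies(coeffs, consts, x):
    """True when the sequence x makes every equation of the system true."""
    return all(sum(a * v for a, v in zip(row, x)) == c
               for row, c in zip(coeffs, consts))

# Coefficients and constants of the example system from the discussion.
A = [[1, 2, -1, 3, 1],
     [2, 4, -2, 6, 3],
     [-1, -2, 1, -1, 3]]
c = [2, 6, 4]
zero = [0, 0, 0]

p = [3, 0, 0, -1, 2]            # a particular solution (x2 = x3 = 0)

def h(s, t):
    """Solution of the associated homogeneous system for x2 = s, x3 = t."""
    return [-2 * s + t, s, t, 0, 0]

checks = []
for s, t in [(1, 0), (0, 1), (2, -3)]:
    hom = h(s, t)
    total = [pi + hi for pi, hi in zip(p, hom)]
    # hom should solve the homogeneous system; p + hom the original one.
    checks.append((satisfies(A, zero, hom), satisfies(A, c, total)))
print(checks)  # [(True, True), (True, True), (True, True)]
```

Notice that h(s, t) by itself does not solve the non-homogeneous system, which is one way to see that the two solution sets are different.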

What did you find when working with Activities 8 and 9? What can you say about the reduced echelon form corresponding to a given solution set? As you were able to verify, two systems of equations have the same solution set if and only if they can be simplified to the same reduced echelon form. In Activity 8, where you verified that the two systems have the same solution set, you found that the corresponding reduced echelon forms are equal. On the other hand, in Activity 9 you discovered that each system yielded a different reduced echelon form. What can you say about the solution set in this case? What is the purpose of Activity 10? Does each reduced echelon form correspond to a single “original” system? How does your answer relate to Exercise 9 in the last section?

In this section you have been applying elementary operations to systems of equations in order to find their solution sets in a convenient computational way. With practice you will find that you can combine several elementary operations into one step. For example, such a combined operation would be the replacing of a row by the sum of multiples of two rows, provided that the row replaced appears in the linear combination with a nonzero coefficient. This operation can be thought of in the following way: if you think of each row of the augmented matrix as a vector, the operation would be the same as replacing one vector with the sum of multiples of two vectors, one of which is the vector being replaced. It will be very useful in later chapters.

The original system of m equations and n unknowns in R, and the corresponding system in echelon form are, as we have seen, equivalent. The system in reduced echelon form is particularly easy to solve because the variable xi appears only in the ith equation. Furthermore, non-zero coefficients appear in all n rows or only in the first r rows. Since each xi appears in but one row with unit coefficient, we can consider two cases: when the n rows have non-zero coefficients, the system has a unique solution; when the non-zero coefficients appear in r rows, the remaining n − r unknowns can be given values arbitrarily, and the corresponding values of the xi can be computed accordingly. The n − r unknowns corresponding to the indexes greater than r can be considered parameters, or free variables, and thus the system has an infinite number of solutions.

The paragraph above can be considered a non-formal proof of the following theorem that will be proved formally in Chapter 6:

Theorem 3.2.1. The system of simultaneous linear equations in K = R,

a11x1 + a12x2+ · · ·+ a1nxn = c1

a21x1 + a22x2+ · · ·+ a2nxn = c2

a31x1 + a32x2+ · · ·+ a3nxn = c3

...

am1x1 + am2x2+ · · ·+ amnxn = cm,

has a solution if and only if all solutions can be expressed in terms of n − r independent parameters, or free variables. The number r is called the rank of the coefficient matrix associated with the system.
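For a consistent system whose augmented matrix is already in reduced echelon form, the numbers r and n − r in the theorem can be read off by counting leading entries. The Python sketch below illustrates the count on the reduced echelon matrix obtained in the worked example of this section; the function name rank_and_free is hypothetical, and the sketch assumes that the last column holds the constants and that the system has at least one solution.

```python
def rank_and_free(R):
    """Return (r, free) for an augmented matrix R in reduced echelon form.

    r is the number of leading entries among the coefficient columns;
    free lists the (1-based) indices of the unknowns without a leading entry.
    Assumes the last column of R holds the constants.
    """
    n = len(R[0]) - 1                      # number of unknowns
    leading_columns = []
    for row in R:
        for j, a in enumerate(row[:-1]):   # ignore the constants column
            if a != 0:
                leading_columns.append(j)
                break
    r = len(leading_columns)
    free = [j + 1 for j in range(n) if j not in leading_columns]
    return r, free

R = [[1, 2, -1, 0, 0, 3],
     [0, 0, 0, 1, 0, -1],
     [0, 0, 0, 0, 1, 2]]
print(rank_and_free(R))  # (3, [2, 3])
```

Here n = 5 and r = 3, so the solutions are described by n − r = 2 free variables, namely x2 and x3.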

Summarizing the Process for Finding the Solution of a System of Equations Using an Augmented Matrix

As we observed in the prior section, we can find the solution set of a system of equations in K = Zp, p prime, by direct substitution: if a sequence of values for the unknowns returns true for each equation of the system, then that sequence of values is an element of the solution set. If K is an infinite set, such as R, then it is impossible to check each sequence. In this case, we must transform the original system of equations into a simpler system. In this section, we executed this process by applying elementary row operations to the augmented matrix of a system of equations. The three elementary row operations (interchange the positions of two rows, multiply a row by a nonzero constant, and add a multiple of one row to another row) are analogous to the three elementary transformations defined in the last section. We apply these operations to transform the original system into echelon or reduced echelon form. Both forms allow us to identify the solution set without having to resort to direct substitution. To use an augmented matrix, we take a system, such as

a11x1 + a12x2 + a13x3 + a14x4 + · · ·+ a1qxq = c1

a21x1 + a22x2 + a23x3 + a24x4 + · · ·+ a2qxq = c2

a31x1 + a32x2 + a33x3 + a34x4 + · · ·+ a3qxq = c3

...

ar1x1 + ar2x2 + ar3x3 + ar4x4 + · · ·+ arqxq = cr,

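form its associated matrix,

\[
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & a_{14} & \cdots & a_{1q} & c_1 \\
a_{21} & a_{22} & a_{23} & a_{24} & \cdots & a_{2q} & c_2 \\
a_{31} & a_{32} & a_{33} & a_{34} & \cdots & a_{3q} & c_3 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots \\
a_{r1} & a_{r2} & a_{r3} & a_{r4} & \cdots & a_{rq} & c_r
\end{pmatrix},
\]

and then apply elementary row operations until we get one of the echelon forms. The simplified system, which is equivalent to the original, will yield either a unique solution, infinitely many solutions, or no solution.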
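The whole process summarized above can be collected into a single routine. The following Python sketch is one possible arrangement of the three elementary row operations for K = R, using exact fractions; it is not the reduce_matrix program from the activities, and the name reduce_to_rref is hypothetical. It assumes that the last column of the matrix holds the constants, so leading entries are sought only among the coefficient columns.

```python
from fractions import Fraction

def reduce_to_rref(M):
    """Return the reduced echelon form of the augmented matrix M.

    Uses only the three elementary row operations. The last column is
    treated as the column of constants and is never used for a leading entry.
    """
    A = [[Fraction(a) for a in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols - 1):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        found = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if found is None:
            continue                       # no leading entry in this column
        A[pivot_row], A[found] = A[found], A[pivot_row]      # interchange rows
        lead = A[pivot_row][col]
        A[pivot_row] = [a / lead for a in A[pivot_row]]      # scale to get a 1
        for r in range(rows):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                # Add a multiple of the pivot row to clear this column entry.
                A[r] = [a - factor * b for a, b in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return A

example = [[1, 2, -1, 3, 1, 2],
           [2, 4, -2, 6, 3, 6],
           [-1, -2, 1, -1, 3, 4]]
for row in reduce_to_rref(example):
    print([str(a) for a in row])
```

A row whose coefficients are all zero but whose constant is nonzero signals that the system has no solution; if no such row appears, counting the leading entries tells you whether the solution is unique or whether there are free variables.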

Exercises

All systems considered in these exercises are in R unless otherwise stated in the exercise.

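1. For each of the augmented matrices given in (a)–(e), determine whether the given matrix is in echelon form. If it is, write the corresponding system of equations, and identify its solution set. If the matrix is not in echelon form, explain why.

(a)
\[
\begin{pmatrix}
0 & 1 & 2 & 1 & 3 \\
0 & 0 & 1 & -1 & 2 \\
0 & 0 & 0 & 1 & -4
\end{pmatrix}
\]

(b)
\[
\begin{pmatrix}
1 & -1 & -3 & -2 & 0 & -1 \\
0 & 1 & 0 & 0 & 0 & 4 \\
0 & 2 & -1 & -5 & -3 & 6 \\
0 & 0 & 5 & 1 & 0 & 7
\end{pmatrix}
\]

(c)
\[
\begin{pmatrix}
1 & 2 & -1 & 0 & 0 & 3 \\
0 & 0 & 1 & -1 & 3 & 4 \\
0 & 0 & 0 & 0 & 1 & 6
\end{pmatrix}
\]

(d)
\[
\begin{pmatrix}
1 & -1 & 2 & 1 & 3 \\
0 & 1 & -3 & 4 & 2 \\
0 & 0 & 0 & 0 & 1
\end{pmatrix}
\]

(e)
\[
\begin{pmatrix}
1 & -1 & 2 & 1 & 3 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 2 & -1 & 5 \\
0 & 1 & -3 & 2 & -4
\end{pmatrix}
\]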

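2. For each of the augmented matrices given in (a)–(f), determine whether the given matrix is in reduced echelon form. If it is, write the corresponding system of equations, and identify its solution set. If the matrix is not in reduced echelon form, explain why.

(a)
\[
\begin{pmatrix}
1 & 0 & 0 & -2 \\
0 & 1 & 1 & 3
\end{pmatrix}
\]

(b)
\[
\begin{pmatrix}
0 & 1 & 3 & 0 & -1 \\
2 & 0 & -1 & 0 & 5 \\
0 & 0 & 0 & 1 & 4
\end{pmatrix}
\]

(c)
\[
\begin{pmatrix}
1 & 0 & 2 & 0 & 0 & 1 \\
0 & 1 & 1 & 0 & 0 & 4 \\
0 & 0 & 0 & 1 & 0 & -4 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

(d)
\[
\begin{pmatrix}
1 & 2 & -3 & 0 & 4 & 3 \\
0 & 0 & 1 & -1 & -2 & 7 \\
0 & 0 & 0 & 0 & 1 & 5
\end{pmatrix}
\]

(e)
\[
\begin{pmatrix}
1 & 0 & 2 & 0 & 3 & 5 \\
0 & 1 & -3 & 0 & 2 & -3 \\
0 & 0 & 0 & 1 & -2 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\]

(f)
\[
\begin{pmatrix}
1 & -1 & 2 & 0 & 3 & 0 & 3 & -5 \\
0 & 0 & 0 & 1 & -2 & 0 & 4 & -2 \\
0 & 0 & 0 & 0 & 0 & 1 & -2 & -2
\end{pmatrix}
\]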

3. Use augmented matrices and elementary row operations to find the solution set of each of the following systems of equations. Identify the leading entries, the free variables, and the rank of the coefficient matrix associated with the system. Write the solution in vector and parametric form.

(a)

3x1 + 6x2 − 3x3 = 6

−2x1 − 4x2 − 3x3 = −1

3x1 + 6x2 − 2x3 = 10


(b)

x1 − x2 + 2x3 + 4x4 = 0

2x1 + 3x2 − x3 + x4 = 0

−4x1 + 5x2 + 3x3 − 2x4 = 0

(c)

4x1 + 8x2 − 12x3 = 28

−x1 − 2x2 + 3x3 = −7

2x1 + 4x2 − 8x3 = 16

−3x1 − 6x2 + 9x3 = −21

(d)

2x1 − x2 + 3x3 + x4 = 12

3x1 + 2x2 − x3 + 4x4 = −3

(e)

x1 + 2x2 + x3 − 3x4 = −3

2x1 + 3x2 + x3 − 5x4 = 9

−2x1 + 4x2 − x3 + 3x4 = −2

−x1 − x2 − x3 + x4 = 1

4. For each system below, (a)–(c), find the values of h, k, and l that result in each system of equations having a unique solution.

(a)

−2x+ y = h

8x− 4y = k

(b)

x+ 3y − z = h

x− y + 2z = k

−3x+ 4y = 1


(c)

2x− 6y = −3

−4x+ 12y = h

5. For the augmented matrices given in parts (a) and (b) below, find the values of r that result in the corresponding system of equations having infinitely many solutions.

(a)
\[
\begin{pmatrix}
1 & 4 & 2 \\
-3 & r & -1
\end{pmatrix}
\]

(b)
\[
\begin{pmatrix}
2 - r & 1 & 0 \\
-1 & 3 - r & 0
\end{pmatrix}
\]

6. For the augmented matrix given below, find values for h and k that result in the corresponding system of equations having no solution.

\[
\begin{pmatrix}
1 & h & 1 \\
2 & 3 & k
\end{pmatrix}
\]

7. Write two systems of equations that have the same solution set.

8. Given the following vector form of a solution set,

S = {〈2, 3s+ 2t,−s+ 2t− 3, 4− s, s,−2, t〉 : s, t ∈ R},

write the associated reduced echelon matrix.

9. Write all of the possible augmented reduced echelon matrices that would correspond to a system of two equations in two unknowns. Denote entries that can be a number other than 1 or 0 with ∗. Identify which matrices correspond to no solution, which correspond to infinitely many solutions, and which correspond to a unique solution.

10. Write all of the possible augmented reduced echelon matrices that would correspond to a system of three equations in three unknowns. Denote entries that can be a number other than 1 or 0 with ∗. Identify which matrices correspond to no solution, which correspond to infinitely many solutions, and which correspond to a unique solution.


11. Based upon your answers to the prior two exercises, what augmentedreduced echelon matrix would correspond to a system of four equationswith four unknowns that has a unique solution? Try to generalize thisto n equations in n unknowns.

12. Find the sequence form of the solution set of the system of equations whose reduced echelon form is given by the augmented matrix

\[
\begin{pmatrix}
1 & -1 & 0 & 1 & 0 \\
0 & 0 & 1 & -2 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix}.
\]

Once you have found the sequence form of the general solution, perform the following tasks:

(a) Determine three different specific nonzero solutions.

(b) Show that these specific sequences satisfy the system corresponding to the augmented matrix.

(c) Take two of the solutions, add them, and show that the resulting sequence is also a solution.

(d) Take a multiple of one solution and a different multiple of another solution and add them. Is the sum still a solution of the system?

(e) If we replace the entry at the 15th position with −3 and the entry at the 5th position with 4, is the resulting sum still a solution?

13. Apply elementary row operations to the augmented echelon matrix

\[
\begin{pmatrix}
1 & 2 & -1 & 3 \\
0 & 1 & -2 & -1 \\
0 & 0 & 1 & -2
\end{pmatrix}
\]

to transform it into the reduced echelon form.

(a) Write the resulting system of equations.

(b) What is the solution set of this system?

(c) What operations would have to be performed to get this system back into the echelon form presented above?


14. Show that [0, 0, 0] is a solution of the system

2x1 − x2 + 3x3 = 0

−x1 + 4x2 − 2x3 = 0.

(a) If we replaced the constants with nonzero values, would [0, 0, 0] be a solution?

(b) If you are given any system of equations where each constant is zero, must the sequence consisting of all zeros be a solution? Explain your answer.

15. Show that [0, 0, 0, 0] is a solution of the system whose augmented matrix is given by

\[
\begin{pmatrix}
2 & -2 & 3 & 4 & 0 \\
-1 & 3 & -4 & 5 & 0 \\
3 & -4 & 2 & -1 & 0
\end{pmatrix}.
\]

(a) If we replaced the constants with nonzero values, would [0, 0, 0, 0] be a solution?

(b) Based upon your answer to this and the preceding exercise, can any system with constants that are all zero ever have no solution? Explain your answer.

(c) Can the zero vector be a solution of a non-homogeneous system of equations? Explain your answer.

16. (a) Are the solution sets of the systems of equations given in the two previous exercises vector spaces?

(b) Write two systems that have the same coefficients as those in the two previous exercises that are not homogeneous.

(c) Find their solution sets.

(d) Are these sets vector spaces?

(e) Can you find a relationship between the form of the systems whose solution sets are vector spaces?

17. What can you say about the solution set of a homogeneous system of equations in which there are as many unknowns as equations? Carefully explain your answer.

18. If you are given a system of equations that has as many unknowns as rows and that has all zero constants, what can you say about the solution set? Carefully explain your answer.

19. In the augmented matrix given below, the third column is the sum of the first two columns. What can we say about the solution set of the associated system of equations: does it consist of a single solution, infinitely many solutions, or no solution? Justify your answer.

\[
\begin{pmatrix}
2 & 3 & 5 & 0 \\
-5 & 1 & -4 & 0 \\
-3 & -1 & -4 & 0 \\
1 & 0 & 1 & 0
\end{pmatrix}
\]

20. In the augmented matrix given below, the third column is the sum of the first column and twice the second column. What can we say about the solution set of the associated system of equations: does it consist of a single solution, infinitely many solutions, or no solution? Justify your answer.

\[
\begin{pmatrix}
1 & -2 & -3 & 0 \\
7 & 4 & 15 & 0 \\
3 & 2 & 7 & 0
\end{pmatrix}
\]



3.3 A Geometric View of Systems

Activities

All the equations and systems in these activities are in K = R, unless otherwise stated in the activity.

1. Given the equation

3x − 5y = 4

(a) Find the solution set of the equation. Does the solution set of the equation form a vector space?

(b) Find five specific ordered pairs that are solutions to the equation. Plot them in the coordinate plane. What do you observe?

(c) Using ISETL, draw the graph of the equation. What do you observe? Use plot to plot the solutions you found in part (a) on the same coordinate system as the graph of the equation when x is in [−3, 3]. What do you find?

(d) Answer using your own words: Where will all the points of the solution set lie when you plot them on the same coordinate system as the graph of the equation?

(e) Repeat the previous instructions for the equation

3x− 5y = 0.

Compare the results found here with those you obtained for the equation given at the beginning of the activity. What are the differences? What are the similarities?

2. Consider the system of equations,

x+ 2y = 0

x+ 2y = −3.

(a) Graph the solution set of each equation in the same coordinate plane.

(b) Select a point that is a solution of the first equation but not of the second one. Locate it on the plane. What do you observe?


(c) Select a point that is a solution of the second equation but not of the first one. Locate it on the plane. What do you observe?

(d) Select a point that is neither a solution of the second equation nor of the first. Locate it on the plane. What do you observe?

(e) Can you find a point that is a solution of both equations? Why?

(f) Find the solution set of the system algebraically, if it exists. What do you find? Compare with your answer to the previous part of this activity. What is the relationship between the algebraic result for the solution set of the system and the geometric representation? Carefully explain.


3. Consider the system of equations,

3x + 2y = 5

6x − 7y = −2.

(e) Find the solution set of the given system algebraically, if it exists. Locate the points of the solution set on the coordinate plane. What do you observe?


4. Consider the system of equations,

−6x + 15y = 6

14x − 35y = −14.

(e) Find the solution set of the given system algebraically, if it exists. Locate the points of the solution set on the coordinate plane. What do you observe?

5. Given a system of two equations in two unknowns,

ax+ by = e

cx+ dy = f,

answer the following questions:

(a) Give conditions on a, b, c, d, e, and f under which a system of two equations in two unknowns has a unique solution, infinitely many solutions, or no solution.

(b) Coordinate each set of conditions with the appropriate geometric representation.

(c) Given a graph of the system and the graph of the two equations when the system has a unique solution, which points in the coordinate graph are solutions of the first equation? the second equation? both equations? neither equation?

(d) Given a graph of the system and the graph of the two equations in the case when the system has infinitely many solutions, which points on the coordinate graph are solutions of the first equation? the second equation? both equations? neither equation?

(e) Given a graph of the system and the graph of the two equations in the case when the system has no solution, which points on the coordinate graph are solutions of the first equation? the second equation? both equations? neither equation?

6. Consider the equation

x − 2y + z = 0.

(a) Find the solution set of the equation. How many solutions does the equation have?

(b) Determine whether the ordered triples

(0, 0, 0), (2, 1, 2), and (2, 1, 0)

are solutions of the given equation. Plot three ordered triples that are elements of the solution set of the equation in a three-dimensional space. Can you imagine a plane passing through those points? Choose three other points from the solution set of the equation and plot them on the same graph as the previous three. What do you obtain?

(c) If we were to graph all the points of the solution set, what would be the form of its geometric representation?

(d) Repeat parts (a) and (b) for the equation

x − 2y + z = 10,

using the ordered triples (10, 0, 0), (2, 1, 6) and (1, 3, 12). Compare the results found here with those you obtained before. What differences and what similarities do you find?

7. Consider the system of equations

−x+ y + z = 0

−x+ y + z = −5.

(a) What is the geometric representation of the solution set of each equation?

(b) Are the geometric representations of the solution sets of the two equations parallel?

(c) Do the geometric representations of the solution sets of the two equations intersect? If they do, can you describe the intersection geometrically?

(d) Find the solution set of the system. Is the solution set you obtained consistent with the geometric representation? Carefully explain.

(e) Compare your answers to those of the previous activity. What do you observe?

(f) Can you write a system of two different linear equations in three unknowns that represents two coincident planes?


8. Consider the system of equations

x − 3y + 2z = 8

3x − 7y + z = 2.

(b) Are the planes represented by the equations parallel? Why?

(c) Do the planes intersect each other? If they do, can you describe the intersection geometrically?

(d) Find the solution set of the system. Is the algebraic solution consistent with the geometric representation? Carefully explain.

(e) Plot the solution set in a three-dimensional space. What do you observe? Compare with your answer to the previous question. Do both answers agree? Why?


9. Consider the system of equations

5x + 2y − z = 11

x − y + z = 1

4x + 2y + 3z = 5.

(b) Are the planes represented by the equations parallel?

(c) Do the first two planes intersect each other? If they do, can you describe the intersection geometrically?

(d) Do the last two planes intersect each other? If they do, can you describe the intersection geometrically?

(e) Solve the system formed by the first two equations. What do you find? What is the geometric representation of this solution set?

(f) Solve the system formed by the last two equations. What do you find? What is the geometric representation of this solution set?

(g) Do the three planes represented by each of the equations intersect? If they do, can you describe the intersection geometrically?

(h) Find the solution set of the entire system. Is the algebraic solution consistent with the geometric representation? Carefully explain.

(i) Plot the solution set in a three-dimensional space. What do you observe? Compare with your answer to the previous question. Do both answers agree? Carefully explain.


10. Consider the system of equations

3x − y − z = 5

x − 5y + z = 3

x + 2y − z = 1.










11. Consider the system of equations

3x − y − z = 5

x − 5y + z = 3

x + 2y − z = 2.










12. Consider the following systems of equations,

(1)

2x+ 3y − 8z = 0

x− 2y + 3z = 0

5x− 3y + z = 0

(2)

2x+ 3y − 8z = 1

x− 2y + 3z = 4

5x− 3y + z = 13.

(a) What is the geometric representation of the solution set of each equation in the two systems?

(b) Given the ordered triple (1, 2, 1), test whether it is a solution to System (1). Test whether it is a solution to System (2). What do you find?

(c) Given the ordered triple (−2, −4, −2), test whether it is a solution to System (1). Test whether it is a solution to System (2). What do you find?

(d) Calculate the ordered triple found by adding (1, 2, 1) and (1, 4, 13). As you can see, this ordered triple is the same vector as (1, 2, 1), but it has been translated to a new position by adding the vector (1, 4, 13). Is the resulting ordered triple a solution to System (1)? Is it a solution to System (2)?

(e) Calculate the ordered triple obtained by adding (−2, −4, −2) and (1, 4, 13). As you can see, this ordered triple is the same vector as (−2, −4, −2), but it has been translated to a new position by adding the vector (1, 4, 13). Is the resulting ordered triple a solution to System (1)? Is it a solution to System (2)?

(f) Find the solution set of System (1). Find the solution set of System (2). What is the relationship between the algebraic representations of the two solution sets?

(g) Can you find any relationship between the geometric representations of the two solution sets? Describe it. Can you find any relationship between the two systems?

Discussion

Equations in Two Unknowns

You already know how to find the solution set of an equation or a system of equations, and you also know the meaning of the solution set in algebraic terms. We are now interested in finding out what equations, systems of equations, and their solutions mean in geometric terms.

We will start with equations in two variables. The equation ax + by = c can be considered as a system consisting of only one equation. What is its solution set? You already know that the solution set of that system over R consists of an infinite set of points in a two-dimensional space. In Activity 1 you plotted the solution sets of two such equations in the coordinate plane. As you found out, the solution set of a single linear equation in two variables is represented geometrically by a line in the coordinate plane. What is the slope of the line that represents the solution set of the first equation in Activity 1? What is its y-intercept? What is the slope of the line that represents the solution set of the second equation in Activity 1? What is its y-intercept? There is a difference between the two equations given in this activity, however. The solution set of one equation forms a vector subspace of R^2, while the other does not. Which is which? When you compared the two lines representing the two equations, you could observe that the lines are parallel: they have the same slope. The second line passes through the origin and the first passes through (0, −4/5); the first line can be considered as the translation of the line through the origin to the point (0, −4/5).

As you recall from the previous sections, you demonstrated that you could obtain the solution set of a non-homogeneous system of equations from the solution set of the associated homogeneous system if you knew one particular solution. What is the particular solution in this case? Is the relationship between the geometric representations consistent with the algebraic relationship just mentioned?
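The translation described above can be checked numerically. The following is only a sketch in Python rather than ISETL; the parametrization (5t, 3t) of the line 3x − 5y = 0 and the helper names are assumptions made for illustration, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def lhs(x, y):
    """Left-hand side 3x - 5y shared by both equations of Activity 1."""
    return 3 * x - 5 * y

# The point (0, -4/5) named in the text, used here as a particular solution.
particular = (Fraction(0), Fraction(-4, 5))

def homogeneous_solution(t):
    """A point on the line 3x - 5y = 0 (this parametrization is an assumption)."""
    return (5 * t, 3 * t)

def translated(t):
    """The homogeneous solution shifted by the particular solution."""
    hx, hy = homogeneous_solution(t)
    return (hx + particular[0], hy + particular[1])

# For several values of t, record the value of 3x - 5y before and after the shift.
checks = [(lhs(*homogeneous_solution(t)), lhs(*translated(t))) for t in range(-2, 3)]
for before, after in checks:
    print(before, after)
```

Each pair printed should show the left-hand side equal to 0 before the shift and 4 after it, which is the algebraic counterpart of sliding the line through the origin to the parallel line.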

In Activity 2 the equations are also in K = R. You were asked to find out whether a point that is a solution of the first equation is also a solution of the other. When you plotted this point on the plane you noticed that it was located on the line representing the solution set of the first equation, but not on the line representing the solution set of the second equation. You also noticed that when you picked a point that is not a solution of either equation, it was not located on either of the lines representing the solution sets of the equations. You finally realized that the lines representing the solution sets of both equations do not intersect, that is, they do not have any point in common. When you solved the system you found out that the solution set is empty: there are no points in the solution set. You could then verify that when the solution set of the system is empty, the lines representing the equations of the system do not intersect. Can you explain this result in terms of the result of the previous activity?

In Activity 3 you were presented again with a system of two equations in K = R and two unknowns. You realized that the geometric representation of the solution set of each of those equations is a line. Why? What are their slopes? What are their y-intercepts? In this case the two lines are not parallel; they intersect in one point. Again, you were asked to find out whether a point that is a solution of the first equation is also a solution of the other. When you plotted this point in the plane you noticed that it was located on the line representing the solution set of the first equation, but not on the line representing the solution set of the second equation. As in the previous activity, you also noticed that the point that is not a solution of the first equation, but is a solution of the second, is located on the line representing the second equation. The point that is not a solution of either equation is again located in the plane outside the two lines representing the equations. Then you were asked to solve the system. This time you found that the system has a unique solution, which is precisely the point of intersection of the lines representing the solution sets of the two equations.


The lines represented by the equations of the system over R given in Activity 4 are parallel. Why? When you chose a point that was a solution to the first equation you noticed that the same point was always a solution to the second equation and vice versa. Was it possible to find a point that satisfied the requirements given by the activity? When you solved the system you realized that any point that is a solution of one equation is a solution of the other. When you plotted both equations you noticed that the lines they represent lie one over the other, that is, the equations are two different ways to write the equation of the same line, so any point on one line is also on the other. The system has an infinite number of solutions, which are represented on the plane by all the points that are on the line. On the other hand, a point that fails to satisfy one of the equations also fails to satisfy the other.

The system presented in K = R in Activity 5 is a general representation of two equations with two unknowns. If you give particular values to the parameters a, b, c, d, e, and f you will have a particular system. Depending on the values assigned to these parameters you can have different situations, but it is always true that each of the equations represents a line. What is the relationship between the slope of the first line, given by −a/b, the slope of the second line, given by −c/d, and the existence of exactly one or infinitely many solutions? If the slopes are equal, under which condition do the y-intercepts, given by e/b and f/d, correspond to a system with infinitely many solutions, represented by coincident lines, or to a system with no solution, represented by parallel lines?

As you have seen in previous activities, systems of two equations and two unknowns over R can have one solution, an infinite number of solutions, or no solution at all. You already know from previous sections that the system will have one solution when it can be row reduced without obtaining a row of zeros, that it will have an infinite number of solutions if row reduction produces a row that consists only of zeros, and that it has no solution if the row reduction process leads to a row in which every entry except the last is zero. Now you also know that each of the equations of the system can be represented geometrically by a line. The number of solutions depends on the position of the two lines in the plane. In previous activities you found that the condition for the system to have one solution is that the lines represented by the equations are not parallel, that is, that they have different slopes. If the two lines represented by the equations have the same slope there are two possibilities: if they have the same y-intercept, the equations are representations of the same line, or you might say that one of the lines lies exactly over the other, and that is why they intersect in an infinite number of points; if the lines represented by the equations do not have the same y-intercept, they are parallel, they do not intersect, and the system has no solution. Can you find a condition for the solution set of a system of equations in K = R to be a vector space? Can you relate this condition to the geometric representation of the solution set?
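The three cases just described can be sketched as a small test on the coefficients. This is only an illustration in Python, not ISETL; the function name classify is hypothetical, and the slope comparison is restated without division (a·d − b·c ≠ 0 instead of −a/b ≠ −c/d) so that vertical lines need no special handling. The sketch assumes each equation has at least one nonzero coefficient.

```python
def classify(a, b, c, d, e, f):
    """Classify the system ax + by = e, cx + dy = f over R.

    Assumes (a, b) and (c, d) are both nonzero, so each equation is a line.
    """
    # Different slopes: the lines cross in exactly one point.
    if a * d - b * c != 0:
        return "unique"
    # Same slope: the lines coincide exactly when the two augmented rows
    # (a, b, e) and (c, d, f) are proportional.
    if a * f - c * e == 0 and b * f - d * e == 0:
        return "infinitely many"
    # Same slope, different intercepts: parallel lines.
    return "none"

print(classify(1, 2, 1, 2, 0, -3))        # system of Activity 2
print(classify(3, 2, 6, -7, 5, -2))       # system of Activity 3
print(classify(-6, 15, 14, -35, 6, -14))  # system of Activity 4
```

The three calls use the systems of Activities 2, 3, and 4, so the labels can be compared with what the discussion says about each of them.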

Equations in Three Unknowns

Equations with three unknowns in R represent, as you observed in Activity 6, planes in a three-dimensional cartesian space R^3. You may recall that the equation of a plane in a three-dimensional space can be found by identifying a vector that is normal to all the vectors that lie in the plane and a point that lies in the plane. A normal vector can be constructed by taking two non-parallel vectors that lie in the plane and computing their cross product. Since the plane consists of the set of all vectors that are normal to this vector, the plane passing through the point (x_0, y_0, z_0) and normal to the vector 〈a, b, c〉 is the set of all points (x, y, z) such that

〈a, b, c〉 · 〈x − x_0, y − y_0, z − z_0〉 = 0,

as shown in Figure 3.1. In Activity 6, the normal vector is given by the coefficients of the variables of the equation x − 2y + z = 0, that is, 〈1, −2, 1〉. The plane passes through the point (0, 0, 0), since the equation has no nonzero constant, so here 〈x_0, y_0, z_0〉 = 〈0, 0, 0〉. If there were a nonzero constant, a point could be identified by selecting values for, say, y and z, and solving the resulting equation for x. Finding the equation for a plane will be dealt with in later chapters.
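The construction of a normal vector from two vectors in the plane can be illustrated with a short computation. This is a sketch in Python rather than ISETL; the two vectors u and v below are chosen for illustration (they are not taken from the activities), and the helper names are hypothetical.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Dot product of two vectors of the same length."""
    return sum(p * q for p, q in zip(u, v))

def on_plane(point, normal, base):
    """True when normal . (point - base) = 0, the condition stated in the text."""
    shifted = tuple(p - b for p, b in zip(point, base))
    return dot(normal, shifted) == 0

# Two non-parallel vectors satisfying x - 2y + z = 0 (illustrative choices).
u = (1, 0, -1)
v = (0, 1, 2)

normal = cross(u, v)
print(normal)
print(on_plane((3, 2, 1), normal, (0, 0, 0)))
print(on_plane((1, 1, 0), normal, (0, 0, 0)))
```

With these particular choices the cross product comes out as the coefficient vector of the equation itself; other pairs of vectors in the plane would give a scalar multiple of it.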

In Activity 6 you plotted the plane corresponding to the given equation and verified that the points in the solution set are on the plane. As in Activity 1, you were asked to consider a second equation that has the same coefficients as the previous one but passes through different points. As the coefficients are the same, the planes represented by the two equations are parallel. Can you explain why? As in Activity 1, when you compared the solution sets of both equations you noticed that you could obtain the solution set of the non-homogeneous system of equations from the solution set of the associated homogeneous system if you knew one particular solution. Again, the second equation represents a plane with the same orientation as that represented by the first equation, but translated to another location. Can you describe the specific translation here? One of the two equations yielded a solution set that is a subspace of R^3. Which is it?

Figure 3.1: Plane with normal vector v = 〈1, −2, 1〉

In Activity 7 you considered a system of two equations and three unknowns over R. In this case you could verify that each of these equations geometrically represents a plane in a three-dimensional cartesian space. If a triple is a solution of one of the equations, it has to represent a point on the plane represented by that equation. Working with the system of equations you noticed that the solution set is empty. In geometric terms this means that the planes represented by the two equations have no points in common: they are parallel. This condition can be verified by comparing the coefficients of the variables in the equations and the constant terms. Exactly how does this help us? Under what condition does a system of two equations in three unknowns in R yield a nonempty solution set? Under what conditions do two planes coincide? What is the solution set of a system of two equations and three unknowns in R in which the corresponding planes coincide?

If two equations represent the same plane, then not only must the normal vectors that define the planes be parallel, but all the points that lie in one plane must also lie in the other. What does this mean in terms of the solution of the system formed by the equations of the two planes?

The planes represented by the equations in Activity 8 are not parallel. You know this because the normal vectors represented by the coefficients of the equations are not parallel. As the planes are not parallel, they will intersect. What is the geometric representation of the intersection of the planes? When you solved the system you found out that the solution set of the system has an infinite number of points. When you plotted the solution set in a three-dimensional space you found that the solution set can be represented by a line. That line is the intersection of the two planes, as indicated in Figure 3.2.

Figure 3.2: Intersecting planes from Activity 8

Systems of Three Equations in Three Unknowns

The planes represented by the three equations of the system given in Activity 9 are not parallel. Why? By looking at the first two planes you found out that there is an infinite number of solutions in the solution set of that system. Those solutions are on a line which lies in both planes. The same result holds for the system formed by the last two equations. What do you expect the intersection of those lines will be? When you solved the entire system, you discovered a single solution. Is this the point of intersection of the lines representing the solution sets of the systems defined by the first and second equations and by the second and third equations?

In the case of Activity 10, you followed the same instructions as in Activity 9. You probably expected the intersection of the three planes to be again a single point in R^3 because the three planes are not parallel. In this activity, however, the system has an infinite number of solutions, and the geometric representation of the solution set is a line in space. See Figure 3.3.

Figure 3.3: Intersecting planes as in Activity 10

The system in Activity 11 is similar to that given in Activity 10 in that the planes are not parallel. This time, however, there is no point common to all three planes; the solution set in this activity is empty, despite the fact that each pair of equations, first/second and second/third, yields a solution set represented by a line. Do the lines formed by each pair intersect?
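One way to see why Activities 10 and 11 behave so differently is to notice that, in both systems, the left-hand side of the first equation equals the left-hand side of the second plus twice that of the third. The sketch below, in Python rather than ISETL, checks this observation and then compares the constants; the variable names and the helper consistent are hypothetical, and the test applies only to these two systems.

```python
# Coefficients (x, y, z) shared by the systems of Activities 10 and 11.
lhs1 = (3, -1, -1)
lhs2 = (1, -5, 1)
lhs3 = (1, 2, -1)

# Second left-hand side plus twice the third.
combo = tuple(p + 2 * q for p, q in zip(lhs2, lhs3))
print(combo == lhs1)

def consistent(c1, c2, c3):
    """The constants must satisfy the same relation as the left-hand sides."""
    return c2 + 2 * c3 == c1

print(consistent(5, 3, 1))  # constants of Activity 10
print(consistent(5, 3, 2))  # constants of Activity 11
```

When the constants obey the same relation, the first equation adds no new condition and the solution set is the line cut out by the other two planes; when they do not, no point can satisfy all three equations at once.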

In Activity 12 you were given two systems that differed only in terms of their constants. You tested whether adding a specific solution of System (2) to a solution of System (1) would yield a solution of System (2). What did you find? If you think in geometric terms, the solution set of System (2), a non-homogeneous system, can be thought of as a translation of the solution set of System (1), a homogeneous system. Can you explain why?

Exercises

All the equations and systems in these exercises are in K = R, unless otherwise stated in the exercise.

1. In this section you found that a linear equation in two variables over R represents a line in a two-dimensional space, and that a linear equation in three variables represents a plane in a three-dimensional space. What can you say about the geometric representation of the solution of a linear equation in more than three variables?

2. Explain in geometric terms why no system of linear equations can have exactly two solutions.

3. Describe the solution set of the system of equations kx + y = −5 and k(x + 2) − y = 3.

4. Determine the value of k such that (4, −1, 1) is on the plane described by the solution set of the equation kx + 3y − kz = −7.

5. Given the equation 5x − 7y + 11 = 0, what is the geometric representation of its solution set? Write a general equation that describes all the possible lines that are parallel to this one.

6. Verify that the solution set of the system of equations given by x + 3y + z = −9, 4x + 3y − 2z = −12 is a line.

7. Write the system of equations that yields the xy-plane as a solution. Write the system of equations that yields the yz-plane as a solution. Write the system of equations that yields the xz-plane as a solution.

8. Consider the system of equations given by

x+ 4y − 5z = 0

2x− y + 8z = 9.

Describe the geometric representation of the solution set of each equation. Then describe the geometric representation of the solution set of the system.

9. Suppose that the solution set of a system of equations is given by

x = 4− 3z

y = −1 + 6z,

where z is a free variable. Use what you know about vectors to describe this solution set as a line in R^3.

10. Suppose that the solution set of a system of equations is given by

x = 7 − w

y = −5 − 2w

z = 5 + 2w,

where w is a free variable. Use what you know about vectors to describe this solution set as a line in R^4.

11. Compare geometrically the solution sets of the following systems of equations:

(1)

6y − 18z = 0

x+ 2y + 3z = 0

2x+ 3y + 9z = 0

(2)

6y − 18z = 24

x+ 2y + 3z = 6

2x+ 3y + 9z = 8.


Chapter 4

Linearity and Span

Vectors and sets of vectors return as the features for discussion. You may have thought we were done with vector spaces as we looked at systems of equations. However, we will need to solve some systems that arise as we spend time familiarizing ourselves with the elements of vector spaces. The main question before us is “What is the smallest subset of a vector space that can be used to get the whole thing?” The answer to that may surprise you.


4.1 Linear Combinations

Activities

1. Let V = (Z_5)^3 be the vector space of triples of elements of Z_5. Parts (a)–(c) refer to V and it is assumed that name vector space has been run.

Let v = 〈1, 2, 3〉 and w = 〈2, 1, 4〉 be two elements in V .

(a) Use ISETL to calculate the vector obtained by multiplying v by the scalar 2.

(b) Use ISETL to calculate the vector obtained by adding the vectors v and w.

(c) Use ISETL to calculate the vector obtained by first multiplying v by the scalar −2 and w by the scalar 3 and then adding the two resulting vector/scalar products together.

2. Write a func LC that will assume name vector space has been run; that will accept two inputs SK and SV, where SK denotes a sequence of scalars, and SV represents a sequence of vectors of the same length; and that will return a vector constructed by taking the linear combination of SV with respect to the sequence SK; that is, the combination formed by first multiplying each vector in SV by its corresponding scalar in SK and then adding together the resulting vectors.

Apply LC to any sequence of four nonzero vectors and four nonzero scalars from the vector space V = (Z_5)^6.

Try to use % (see Section 1.2, p. 27) in writing this func.

3. Let V = (Z_5)^3. Use the ISETL func LC you constructed in Activity 2 to perform the tasks given in parts (a)–(b) below.

(a) Let u = 〈1, 2, 1〉, v = 〈3, 1, 4〉, and w = 〈4, 0, 2〉 be a sequence of three vectors in V. Write all the possible sequences [b, c, d] of three scalars based upon the possible choices of b ∈ {0, 1}, c ∈ {2, 3}, and d ∈ {4}. Then, apply LC to [u, v, w] and each scalar sequence. Identify which combinations yield the zero vector. Are your results consistent with those you would get by computing each combination by hand?

(b) Use tool Three eqn you constructed in Chapter 3 to find the values of a, b, and c that make the following statements true. Once you have found these values, write an ISETL statement that uses LC to verify your results.

i. a 〈2, 1, 3〉 + b 〈1, 2, 1〉 + c 〈4, 0, 2〉 = 〈0, 0, 0〉

ii. a 〈3, 1, 1〉 + b 〈4, 1, 2〉 + c 〈2, 2, 3〉 = 〈0, 0, 0〉

4. Write a func that assumes that name vector space has been executed; that will accept two inputs SK and SV, where SK and SV are defined as in Activity 2; and that will return a boolean value obtained by applying LC to the pair SK and SV to check whether the resulting combination LC(SK,SV) is equal to the zero vector.

Test your func on the linear combinations from Activity 3, part (a).

5. Consider the following ISETL code.

    UKn := func(n);
        if n = 1 then return {[s] : s in K};
        else return {t with s : t in UKn(n-1), s in K};
        end;
    end;

Explain how this program is executed in the case in which K = Z_3 and n = 2; K = Z_3 and n = 3; K = Z_3 and n = 4. Then, explain how this program is executed in general. What is your interpretation of the result of running this program?

6. Write a func All LC that assumes that name vector space has been executed; that will accept a single input SETV, where SETV denotes a set of vectors in V; and that will return the set of all possible linear combinations of SETV. Use UKn in your program and note that if you want to use LC on SETV, then you will first have to convert it from a set to a sequence.

7. Use the vector space V = (Z_3)^4 to complete parts (a)–(d) below.

(a) Construct five different sets of vectors, each set consisting of four vectors from V.

(b) For each set in (a), compute all possible linear combinations by hand. (Leave these combinations unsimplified.)

(c) Apply All_LC to each set in (a) to “simplify” all of the combinations you constructed in (b). How many different linear combinations do you get for each of these sets?

(d) For each of these sets, which linear combinations yield the zero vector?

8. Write a func LU that assumes that name_vector_space has already been executed; that will accept two inputs v and SETV from a vector space V, where v is any vector, and SETV is a set of vectors; and that will return a boolean value obtained by determining whether v can be written as a linear combination of SETV in one and only one way.

For the vector space V = (Z7)^4, use LU to determine the answer to the following two questions.

(a) Can the vector v = 〈3, 4, 1, 2〉 be expressed uniquely as a linear combination of the set

SET1V = {〈1, 2, 1, 0〉, 〈3, 0, 1, 2〉, 〈2, 1, 0, 1〉, 〈4, 0, 3, 5〉, 〈5, 3, 0, 3〉}?

(b) Can the vector v = 〈3, 4, 1, 2〉 be expressed uniquely as a linear combination of the set

SET2V = {〈1, 0, 0, 0〉, 〈3, 0, 1, 2〉, 〈2, 1, 0, 1〉, 〈4, 0, 3, 5〉}?

9. Consider the vector v = 〈1, 2〉 in R^2. Use the ISETL func vectors to view this vector in the plane. Let t = ±2, ±3, ±0.5. For each value of t, use vectors to graph the scalar multiple tv.

(a) What do you observe? Based upon these examples, what does the set

{t · v : t ∈ R}

look like when it is graphed in the plane?


(b) Explain why it looks this way.

10. Consider the vectors v = 〈1, 3〉 and w = 〈−1, 2〉 in R^2. Let a = 0.1, 0.2, 0.3, . . . , 1. Use vectors to view each possible combination: take the product of a with v and (1 − a) with w, and add the results together. In short, graph av + (1 − a)w for each value of a given above. What do you observe? Based upon these examples, what do you think the set

{av + (1 − a)w : a ∈ [0, 1]}

looks like when it is graphed in the plane?

11. Consider the vectors v = 〈1, 3〉 and w = 〈−1, 2〉 in R^2. Let a = 0.1, 0.2, . . . , 4. Use vectors to view the following combinations: take the product of a with w and add it to v. In short, graph v + aw for each value of a given above.

(a) What do you observe? Based upon these examples, what do you think the set

{v + aw : a ∈ R}

looks like when it is graphed in the plane?

(b) How would it look if you let a run through all values in [0,∞)? How about (−∞, 1]? (−∞,∞)?

12. Consider the differential equation,

f ′′ + f = 0,

where the function f is in C∞(R).

Find three functions which are solutions to this differential equation. Then choose any three scalars in R and use them to form a linear combination with your three functions. Is this linear combination also a solution?


Discussion

The Difference Between a Set and a Sequence

In this section of activities, you may have noticed that some activities refer to a set of vectors or scalars and others refer to a sequence of vectors or scalars. Do you recall the difference between a set and a sequence? From Chapter 1 you will recall that a set in ISETL is designated by curly braces { }, whereas a sequence is denoted by square brackets [ ] and is called a tuple. What properties differentiate sets from tuples?

In addition to having different properties, sets and tuples are conceptually different. Let's review some properties that you worked with in Chapter 1. A sequence is a function whose domain consists of the set or a subset of the positive integers and whose range can be any set. How would you see a list like a1, a2, a3, . . . , an, . . . as a function? If the name of the function in this case is f, what would be meant by f(1), f(2), f(3)? In the context in which we are working, we can focus upon the listing representation, but, because a sequence is a function, it is not just any list; it is an ordered list. Where does the order come from? For example, the sequence [4, 5, 6] is different from the sequence [6, 4, 5] because of the difference in the order of the presentation of the elements. On the other hand, a set is an unordered collection of objects. So, the set {4, 5, 6} is equal to the set {6, 4, 5}. Additionally, if an element is repeated in a sequence, say [4, 5, 5, 6], the repeated 5 cannot be dropped as it would be if we were talking about the set {4, 5, 5, 6}, which is actually equal to the set {4, 5, 6}.

This distinction comes up when we have specific scalars and specific vectors that we want to use in forming a linear combination. Thus, in Activity 1(c) we had the scalars −2, 3 and the vectors v, w. We wanted to form the linear combination −2v + 3w. This means that we are thinking about the sequence of scalars [−2, 3] and the sequence of vectors [v, w] and not sets of vectors or scalars. What sequences would we use if we wanted to form the linear combination 3v − 2w? Or −2w + 3v?

Can you explain why this means that in Activity 2, the func LC has to take inputs SK, SV which are, respectively, a sequence of scalars and a sequence of vectors? On the other hand, in Activity 6, the func All_LC takes a set of vectors. What is the difference? Since All_LC calls the func LC, how did you deal with the fact that All_LC receives a set of vectors, but LC needs a sequence to work with? Was a conversion involved here?

Forming Linear Combinations

In most of your work so far, each vector space has been of the form K^n, where each vector is an n-tuple whose components are elements of the scalar set K. A vector space does not generally have to be of this form (for example, Pn(K) and C∞(R)). However, many of the vector spaces that we will encounter in this course will have elements consisting of tuples of entries from the scalar set. Indeed, we will discuss in Section 4.4 that there is a sense in which every vector space is essentially the same as a vector space of tuples of elements of the scalar set.

No matter what form its elements assume, a vector space is always defined over, or accompanied by, a set of scalars. For this reason, whenever a vector space is defined, it is common to use the phrase: Let V be a vector space over a field K. For our purposes, we do not need to know what a field is; you will study that concept in abstract algebra. In this course, the scalar field will be a familiar set like the real number system or a finite set such as Z2, Z3, Z5, or Z7.

Given this relationship between a vector space and its scalar field, how would you, given two vectors v and w in a vector space V and two scalars a and b in its associated scalar field K, explain how to multiply v by a? Add v and w? Compute the combination av + bw?

In Activity 1, you were asked to perform this series of operations. In particular, given the set {v = 〈1, 2, 3〉, w = 〈2, 1, 4〉} in (Z5)^3, you were asked to form three combinations: 2v in part (a), v + w in part (b), and −2v + 3w in part (c). As it turns out, these combinations represent three of the possible linear combinations of the set of vectors {v = 〈1, 2, 3〉, w = 〈2, 1, 4〉}. If we let a, b ∈ Z5, where a ∈ {1, 2, 3} and b ∈ {2, 4}, what are the linear combinations of the form av + bw for the given values of a and b?

In Activity 2, you constructed a func that accepted as input a sequence of vectors SV and a sequence of scalars SK and returned the linear combination of the corresponding pair as output. If we keep the vectors in the order in which they were given in Activity 1, then SV = [v, w], with SK = [2, 0] in part (a), SK = [1, 1] in part (b), and SK = [−2, 3] in part (c). In Activity 3, SV = [〈1, 2, 1〉, 〈3, 1, 4〉, 〈4, 0, 2〉], and, based upon the choices for b, c, and d, the possible scalar sequences are [0, 2, 4], [0, 3, 4], [1, 2, 4], and [1, 3, 4]. If we let SV = [v1, v2, v3] be an arbitrary sequence of vectors, and if we let SK = [a1, a2, a3] be an arbitrary sequence of scalars, what would be the form of the linear combination of the sequence SV with respect to the sequence SK? The definition given below discusses how to form such linear combinations in general.

Definition 4.1.1. Let V be a vector space over the field K. Let SV = [v1, v2, . . . , vq] be a sequence of q vectors in V, and let SK = [a1, a2, . . . , aq] be a sequence of q scalars in K. The linear combination of the sequence SV with respect to the scalar sequence SK is given by:

a1v1 + a2v2 + · · ·+ aqvq.
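
To see Definition 4.1.1 at work outside ISETL, here is a minimal Python sketch. It is not the func LC from Activity 2: the name linear_combination and its modulus parameter are assumptions introduced only for this illustration, and passing a modulus p simply carries out the arithmetic in Zp. The sketch recomputes the three combinations from Activity 1 in (Z5)^3.

```python
def linear_combination(SK, SV, modulus=None):
    """Return a1*v1 + ... + aq*vq as a single tuple of components.

    SK is a sequence of scalars and SV a sequence of vectors (tuples of
    equal length). If modulus is given, every component is reduced mod it.
    (Hypothetical helper; not part of the text's ISETL tools.)
    """
    if len(SK) != len(SV):
        raise ValueError("SK and SV must have the same length")
    n = len(SV[0])
    result = []
    for j in range(n):
        # j-th component: a1*v1[j] + a2*v2[j] + ... + aq*vq[j]
        total = sum(a * v[j] for a, v in zip(SK, SV))
        result.append(total % modulus if modulus is not None else total)
    return tuple(result)


v = (1, 2, 3)
w = (2, 1, 4)
print(linear_combination([2, 0], [v, w], modulus=5))   # 2v        -> (2, 4, 1)
print(linear_combination([1, 1], [v, w], modulus=5))   # v + w     -> (3, 3, 2)
print(linear_combination([-2, 3], [v, w], modulus=5))  # -2v + 3w  -> (4, 4, 1)
```

Note that the order of the scalars must match the order of the vectors, which is why both inputs are sequences rather than sets.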

Simplified Single-Vector Representations

Let V = R^4, the vector space of 4-tuples with real-number entries. Let

SK = [3, 2, 5] and SV = [〈2, 4, 3, 1〉 , 〈−3, 0, 1, 5〉 , 〈3, 2, 6, 4〉].

If we want to express the linear combination of SV with respect to SK as a single vector, we would explicitly perform the operations indicated by the definition; in particular,

3 〈2, 4, 3, 1〉 + 2 〈−3, 0, 1, 5〉 + 5 〈3, 2, 6, 4〉
= 〈6, 12, 9, 3〉 + 〈−6, 0, 2, 10〉 + 〈15, 10, 30, 20〉
= 〈15, 22, 41, 33〉.

If we continue to let V = R^4, but if, in this case, we let

SK = [a1, a2, a3],

SV = [v1,v2,v3]

= [〈v11, v12, v13, v14〉 , 〈v21, v22, v23, v24〉 , 〈v31, v32, v33, v34〉],

how would we express the linear combination of SV with respect to SK in the form of a single vector?

In general, if V = K^n, the vector space of n-tuples with entries in K, and if

SV = [v1,v2,v3, . . . ,vq]

is a sequence of q vectors in V, then the single-vector form of the linear combination of SV with respect to the scalar sequence

SK = [a1, a2, a3, . . . , aq]

would be given as follows:

a1v1 + a2v2 + · · · + aqvq

= a1 〈v11, v12, v13, . . . , v1n〉 + a2 〈v21, v22, v23, . . . , v2n〉 + a3 〈v31, v32, v33, . . . , v3n〉 + · · · + aq 〈vq1, vq2, vq3, . . . , vqn〉

= 〈a1v11, a1v12, a1v13, . . . , a1v1n〉 + 〈a2v21, a2v22, a2v23, . . . , a2v2n〉 + 〈a3v31, a3v32, a3v33, . . . , a3v3n〉 + · · · + 〈aqvq1, aqvq2, aqvq3, . . . , aqvqn〉

= 〈(a1v11 + a2v21 + a3v31 + · · · + aqvq1), (a1v12 + a2v22 + a3v32 + · · · + aqvq2), (a1v13 + a2v23 + a3v33 + · · · + aqvq3), . . . , (a1v1n + a2v2n + a3v3n + · · · + aqvqn)〉
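
The single-vector form can be checked against the step-by-step computation. The Python sketch below is only an illustration (the helper names scale, add, by_steps, and by_components are hypothetical): it computes the R^4 example above once by scaling each vector and adding, and once directly from the componentwise formula.

```python
def scale(a, v):
    """Multiply every component of the vector v by the scalar a."""
    return tuple(a * x for x in v)


def add(u, v):
    """Add two vectors of the same length component by component."""
    return tuple(x + y for x, y in zip(u, v))


def by_steps(SK, SV):
    """Form a1*v1, a2*v2, ... and add them up one vector at a time."""
    total = (0,) * len(SV[0])
    for a, v in zip(SK, SV):
        total = add(total, scale(a, v))
    return total


def by_components(SK, SV):
    """Use the closed form: component j is a1*v1j + a2*v2j + ... + aq*vqj."""
    q, n = len(SV), len(SV[0])
    return tuple(sum(SK[i] * SV[i][j] for i in range(q)) for j in range(n))


SK = [3, 2, 5]
SV = [(2, 4, 3, 1), (-3, 0, 1, 5), (3, 2, 6, 4)]
print(by_steps(SK, SV))       # (15, 22, 41, 33)
print(by_components(SK, SV))  # (15, 22, 41, 33)
```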

Geometric Representation

Let V = R^2, the vector space of ordered pairs with real-number entries. If v = 〈v1, v2〉 ∈ V, then v is represented geometrically by an arrow whose initial point is the origin (0, 0) and whose terminal point has coordinates given by the ordered pair (v1, v2). If we let v1 = 3 and v2 = 2, the set of all linear combinations of the vector 〈3, 2〉, a set of the same form as the one graphed in Activity 9, is represented algebraically by

{c 〈3, 2〉 : c ∈ R}.

How would you describe the graph of this set? Take a sheet of paper, draw it by hand, and compare it with what you got on the screen in Activity 9. What do you observe?

How about Activity 10?

You should have discovered that the graph of the set {c 〈3, 2〉 : c ∈ R} is a line that passes through the origin. If you recall, every non-vertical line in the plane can be represented by an equation of the form y = mx + b, where m is the slope of the line, and b is the y-intercept. Since this line passes through the origin, the value of b is zero. What is the value of m? To answer this question, we need to identify two points that lie on the line: certainly, one is (0, 0); if we let c = 1, another point is (3, 2). Based upon these two ordered pairs, we see that the equation of the line is y = (2/3)x. This is precisely the relationship that exists between the first and second coordinates of any vector in R^2 that is an element of the set

{c 〈3, 2〉 : c ∈ R}.


Figure 4.1: {c 〈3, 2〉 : c ∈ R} (plot of the vector v = 〈3, 2〉 drawn from the origin O)

In particular, if

(x, y) ∈ {c 〈3, 2〉 : c ∈ R},

and if x ≠ 0 and c ≠ 0, then the relationship between x and y can be represented by the ratio

y/x = 2c/(3c) = 2/3;

that is, y = (2/3)x. Of course, this relationship continues to hold in the case in which x and y are both zero.

Conversely, any solution of the equation y = (2/3)x can be represented in vector form by the tuple 〈x, (2/3)x〉. This form can be simplified to

(1/3)x 〈3, 2〉.

Since x is an arbitrary real number, (1/3)x can represent any real number, which means that if we let c = (1/3)x, the above scalar multiple can be written in the form c 〈3, 2〉. Therefore, any vector whose components satisfy the equation y = (2/3)x is an element of the set {c 〈3, 2〉 : c ∈ R}.

Hence, the set of vectors in R^2 whose components satisfy the equation y = (2/3)x is equal to the set of vectors given algebraically by {c 〈3, 2〉 : c ∈ R}.
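
Both directions of this argument can be spot-checked numerically. The following Python sketch is an illustration only: the function names and the choice of sample values are assumptions, and exact fractions are used so that no rounding is involved. A finite check like this does not replace the proof; it only confirms the algebra on samples.

```python
from fractions import Fraction


def multiple_of_3_2(c):
    """Return the vector c<3, 2> as a pair."""
    return (3 * c, 2 * c)


def on_line(point):
    """Check whether the point satisfies y = (2/3)x."""
    x, y = point
    return y == Fraction(2, 3) * x


# Sample scalars -2, -7/4, ..., 7/4, 2 (an arbitrary choice for illustration).
samples = [Fraction(k, 4) for k in range(-8, 9)]

# Direction 1: every multiple c<3, 2> lies on the line y = (2/3)x.
print(all(on_line(multiple_of_3_2(c)) for c in samples))  # True

# Direction 2: every solution <x, (2/3)x> equals c<3, 2> with c = (1/3)x.
print(all(multiple_of_3_2(x / 3) == (x, Fraction(2, 3) * x) for x in samples))  # True
```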

Go back to Activity 9: Does the graph of the set of vectors given in that exercise form a line in the plane? If so, what is the equation of the line? What is the relationship between the first and second coordinates of the vectors in {tv : t ∈ R}? Does the relationship, if any, reflect what you found in the previous example? In general, can the graph of any set of vectors of the form

{c 〈a, b〉 : c ∈ R},

where a ≠ 0 or b ≠ 0, be represented as a line through the origin whose slope is given by the ratio b/a? Explain your answer.

In Activity 11, you were asked to find the graph of the set

{v + aw : a ∈ R},

where v = 〈1, 3〉 and w = 〈−1, 2〉. This set of vectors is exactly the same as the set of vectors specified by the xy-equation y = 5 − 2x. How do we show this?

Figure 4.2: {v + aw : a ∈ R} (plot showing the vectors v and w together with v − w, v + w, and v + 2w)

Let 〈x, y〉 ∈ {v + aw : a ∈ R}. Then,

(x, y) = v + aw
= (1, 3) + a(−1, 2)
= (1 − a, 3 + 2a).

Since x = 1 − a and y = 3 + 2a, we have a = 1 − x, and it follows that y = 3 + 2(1 − x) = 5 − 2x. Hence, the components of every vector in {v + aw : a ∈ R} form a solution of the equation y = 5 − 2x.


On the other hand, each solution of the equation y = 5 − 2x can be expressed in vector form as

〈x, 5− 2x〉 .

This is equivalent to the vector sum

〈1, 3〉+ 〈x− 1, 2− 2x〉 .

If we let c = x− 1, then the sum becomes

〈1, 3〉+ 〈x− 1, 2− 2x〉 = 〈1, 3〉+ 〈c,−2c〉 = 〈1, 3〉+ (−c) 〈−1, 2〉 .

Since x is an arbitrary real number, c is an arbitrary scalar, which means that −c is also arbitrary. So, if we let a = −c, we get

〈1, 3〉+ a 〈−1, 2〉 ,

from which it follows that every vector whose components are a solution to the equation y = 5 − 2x is a vector of the form

〈1, 3〉+ a 〈−1, 2〉 .

Therefore, the solution set, in vector form, of the equation y = 5 − 2x is equal to the set of vectors {〈1, 3〉 + a 〈−1, 2〉 : a ∈ R}.
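
The same kind of spot check works for this line, which does not pass through the origin. In the Python sketch below (an illustration under the stated assumptions: the names point and on_line and the sample values are hypothetical), the first test confirms that v + aw always satisfies y = 5 − 2x, and the second confirms that the solution 〈x, 5 − 2x〉 is recovered from the scalar a = 1 − x.

```python
from fractions import Fraction

v = (1, 3)
w = (-1, 2)


def point(a):
    """Return v + a*w as a pair."""
    return (v[0] + a * w[0], v[1] + a * w[1])


def on_line(p):
    """Check whether the point satisfies y = 5 - 2x."""
    x, y = p
    return y == 5 - 2 * x


# Sample values -3, -5/2, ..., 5/2, 3 (an arbitrary choice for illustration).
samples = [Fraction(k, 2) for k in range(-6, 7)]

# Direction 1: every vector v + a*w is a solution of y = 5 - 2x.
print(all(on_line(point(a)) for a in samples))  # True

# Direction 2: the solution <x, 5 - 2x> is v + a*w for a = 1 - x.
print(all(point(1 - x) == (x, 5 - 2 * x) for x in samples))  # True
```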

How is the set of vectors {v + aw : a ∈ R} related to {aw : a ∈ R}?

Figure 4.3: {sv + tw : s, t ∈ R} (plot of the vectors v and w)


Let V = R^3 be the vector space of ordered triples of real numbers. If

v = 〈v1, v2, v3〉 ∈ V and w = 〈w1, w2, w3〉 ∈ V,

then the ordered triples 〈v1, v2, v3〉 and 〈w1, w2, w3〉 represent two arrows whose initial points are the origin (0, 0, 0) and whose terminal points have coordinates given by (v1, v2, v3) and (w1, w2, w3), respectively. If v and w are not multiples of each other, that is, w ≠ cv and v ≠ cw for any scalar c ∈ R, then the set of all possible linear combinations of v and w, denoted by the set

{sv + tw : s, t ∈ R},

forms the plane generated by {v, w}. The vectors v, w are referred to as generators of this plane. If you recall from multivariable calculus, every pair of non-parallel vectors, that is, vectors that are not multiples of one another, defines a plane. Also recall from that course the two binary operations on vectors: the dot product and the cross product. The equation of a plane is obtained by identifying a normal vector, say 〈a, b, c〉, formed by taking the cross product of the two generators, and simplifying the resulting dot product equation

〈a, b, c〉 · 〈x− x0, y − y0, z − z0〉 = 0,

where (x0, y0, z0) is any fixed point in the plane, (x, y, z) is an arbitrary point in the plane, and 〈x − x0, y − y0, z − z0〉 refers to the directed line segment whose initial point is (x0, y0, z0) and whose terminal point is (x, y, z).

To understand better the connection between the set of all linear combinations of a generating set and the plane formed by two generators, let’s consider the following example. Let v = 〈1, 2, 3〉 and w = 〈2, 3, 1〉. The set of all linear combinations of v and w is denoted by the set

{s 〈1, 2, 3〉+ t 〈2, 3, 1〉 : s, t ∈ R}.

Since v = 〈1, 2, 3〉 and w = 〈2, 3, 1〉 are not multiples of each other, 〈1, 2, 3〉 and 〈2, 3, 1〉 define a plane whose normal vector is 〈−7, 5, −1〉, found by taking the cross product of 〈1, 2, 3〉 and 〈2, 3, 1〉. This yields the following equation, when using the point (1, 2, 3):

〈−7, 5,−1〉 · 〈x− 1, y − 2, z − 3〉 = 0

−7(x− 1) + 5(y − 2)− 1(z − 3) = 0

−7x+ 7 + 5y − 10− z + 3 = 0

7x− 5y + z = 0.


We claim that the solution set of 7x− 5y + z = 0 is the set of vectors

{s 〈1, 2, 3〉+ t 〈2, 3, 1〉 : s, t ∈ R}.

In order to prove this claim, we must show that every solution of the equation 7x − 5y + z = 0 is a linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉, and then we must prove that every linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉 is a solution of the equation 7x − 5y + z = 0.

Every solution of the equation 7x − 5y + z = 0 can be written as a vector in the form 〈x, y, 5y − 7x〉. To see that this vector is an element of the set {s 〈1, 2, 3〉 + t 〈2, 3, 1〉 : s, t ∈ R}, we must show that 〈x, y, 5y − 7x〉 is a linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉. In particular, we must find scalars s and t such that the equation

〈x, y, 5y − 7x〉 = s 〈1, 2, 3〉+ t 〈2, 3, 1〉

holds. Simplifying, we obtain,

〈x, y, 5y − 7x〉 = 〈s, 2s, 3s〉 + 〈2t, 3t, t〉
= 〈s + 2t, 2s + 3t, 3s + t〉,

which is a system of 3 equations in the 2 unknowns s and t,

s+ 2t = x

2s+ 3t = y

3s+ t = 5y − 7x.

In Chapter 3 you worked with such systems and developed methods for finding the solution set. Can you use those methods to show that the solution is given by

s = 2y − 3x

t = 2x − y?

Hence, every vector 〈x, y, z〉 whose coordinates x, y, and z are a solution of the equation 7x − 5y + z = 0 is a linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉, where s, t are 2y − 3x, 2x − y, respectively.

Next, we want to show that every element of the set

{s 〈1, 2, 3〉+ t 〈2, 3, 1〉 : s, t ∈ R}


is a solution of the equation 7x − 5y + z = 0. Let a 〈1, 2, 3〉 + b 〈2, 3, 1〉 be an element of the set {s 〈1, 2, 3〉 + t 〈2, 3, 1〉 : s, t ∈ R}. This linear combination, when simplified, is 〈a + 2b, 2a + 3b, 3a + b〉. Substituting each component for the respective variables x, y, and z yields

7(a+ 2b)− 5(2a+ 3b) + (3a+ b) = 7a+ 14b− 10a− 15b+ 3a+ b

= (7a− 10a+ 3a) + (14b− 15b+ b)

= 0,

which shows that a 〈1, 2, 3〉 + b 〈2, 3, 1〉 is a solution of 7x − 5y + z = 0. Since we have shown that every solution of 7x − 5y + z = 0 is a linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉, and since we have proven that every linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉 is a solution of 7x − 5y + z = 0, it follows that the set

{s 〈1, 2, 3〉+ t 〈2, 3, 1〉 : s, t ∈ R},

is the plane generated by the vectors 〈1, 2, 3〉 and 〈2, 3, 1〉 and given by the equation 7x − 5y + z = 0.

What we have shown in this discussion is that the solution set of the equation 7x − 5y + z = 0 is the set {s 〈1, 2, 3〉 + t 〈2, 3, 1〉 : s, t ∈ R}, which is a plane in R^3.
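
The computations in this example can be reproduced with a short Python sketch. It is offered only as a check on the arithmetic, with hypothetical helper names (cross, combo, in_plane) and integer sample values chosen for convenience; it computes the normal vector and tests both directions of the claim on a grid of samples.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def combo(s, t, v, w):
    """Return the linear combination s*v + t*w."""
    return tuple(s * a + t * b for a, b in zip(v, w))


def in_plane(p):
    """Check whether the point satisfies 7x - 5y + z = 0."""
    x, y, z = p
    return 7 * x - 5 * y + z == 0


v = (1, 2, 3)
w = (2, 3, 1)
print(cross(v, w))  # (-7, 5, -1)

grid = range(-3, 4)

# Direction 1: each solution <x, y, 5y - 7x> equals s*v + t*w
# with s = 2y - 3x and t = 2x - y.
print(all(combo(2 * y - 3 * x, 2 * x - y, v, w) == (x, y, 5 * y - 7 * x)
          for x in grid for y in grid))  # True

# Direction 2: each combination s*v + t*w satisfies the plane equation.
print(all(in_plane(combo(s, t, v, w)) for s in grid for t in grid))  # True
```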

Vectors Generated by a Set of Vectors—Span

Throughout the previous subsection, we have used the terms generator or is generated by. This was always in very specific contexts and so the meaning should have been clear to you. Was it? You need to understand these terms thoroughly and in a general context. In Activity 6, you wrote a func All_LC that formed the set of all linear combinations of vectors taken from a given set. What does this have to do with the set of vectors generated by a given set?

In the context of forming linear combinations, we say that the set of all linear combinations of the vectors u and v, which is given by the set

{su + tv : s, t ∈ K},

is the set of vectors generated by u and v. Consequently, whenever you are given the phrase “Find the set of vectors generated by u, v, and w,” you are actually being asked to find the set of all linear combinations of u, v, and w; that is, the set whose form is

{qu + sv + tw : q, s, t ∈ K}.

This term is important enough to warrant a formal definition.

Definition 4.1.2. If S is a set of vectors in a vector space V, then the set generated by S is the set W of all linear combinations of vectors in S. We say that the elements of S are the generators of W and that W is the span of S.

Do you think that in the context of this definition, W must turn out to be a subspace of V?

What Vectors Can You Get from Linear Combinations?

In Activity 6, you wrote a func to compute all of the vectors you get by forming linear combinations of vectors in a given set, that is, you computed the set generated by the given set. In Activities 4 and 7, you considered the more specific question of whether one of the linear combinations was equal to the zero vector. Using the computer is one way of solving such problems and in Chapter 6, you will develop similar methods using matrices.
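
For a finite scalar field such as Zp, the set generated by a given set can be found by brute force: run through every sequence of scalars and collect the resulting combinations. The Python sketch below illustrates the idea; it is not the ISETL func All_LC, the name span_mod_p is an assumption, and the two sample vectors in (Z3)^3 were chosen arbitrarily for illustration. The call to itertools.product plays the role that UKn plays in Activity 5, producing all scalar sequences of a given length.

```python
from itertools import product


def span_mod_p(vectors, p):
    """Return the set of all linear combinations of the vectors over Z_p.

    The vectors are first put in a fixed order (a list), because a scalar
    sequence only makes sense against a sequence of vectors.
    """
    vectors = list(vectors)
    n = len(vectors[0])
    result = set()
    for scalars in product(range(p), repeat=len(vectors)):
        combination = tuple(
            sum(a * v[j] for a, v in zip(scalars, vectors)) % p
            for j in range(n)
        )
        result.add(combination)
    return result


S = [(1, 2, 0), (0, 1, 1)]  # hypothetical sample vectors in (Z_3)^3
W = span_mod_p(S, 3)
print(len(W))          # 9 distinct vectors out of the 27 in (Z_3)^3
print((0, 0, 0) in W)  # True: all scalars equal to 0 gives the zero vector
```

The set data type removes duplicates automatically, so the size of W counts distinct vectors rather than scalar sequences.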

There is still another way. Go back a few pages to where you worked out the solution set of the equation 7x − 5y + z = 0. What does this have to do with the set of vectors generated by the set {〈1, 2, 3〉, 〈2, 3, 1〉}? What do the vectors generated by this set have to do with the plane in R^3 determined by 7x − 5y + z = 0?

As we saw earlier, we can check whether a vector can be written as a linear combination of a set of vectors by solving a vector equation. For instance, suppose we wish to determine whether the vector 〈7, 12, 18〉 can be written as a linear combination of {〈1, 2, 3〉, 〈2, 3, 1〉}. This question, similar to what you were asked to do in Activity 3(b) and what was shown above, amounts to asking whether we can find scalars a and b such that the following vector equation is true:

a 〈1, 2, 3〉+ b 〈2, 3, 1〉 = 〈7, 12, 18〉 .

If we simplify the linear combination on the left and equate components (why?), we get a system of three equations in the two unknowns a and b:

〈7, 12, 18〉 = a 〈1, 2, 3〉 + b 〈2, 3, 1〉
= 〈a, 2a, 3a〉 + 〈2b, 3b, b〉
= 〈a + 2b, 2a + 3b, 3a + b〉,

which, when simplified further, yields

a+ 2b = 7

2a+ 3b = 12

3a+ b = 18.

As it turns out, this system, which you should try to solve yourself, has no solution. Hence, the vector 〈7, 12, 18〉 cannot be written as a linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉, and the components of the vector, when substituted into the expression 7x − 5y + z, would render the equation 7x − 5y + z = 0 false. If, on the other hand, the system above had yielded a solution, then 〈7, 12, 18〉 could be written as a linear combination of 〈1, 2, 3〉 and 〈2, 3, 1〉; 〈7, 12, 18〉 would lie in the plane generated by 〈1, 2, 3〉 and 〈2, 3, 1〉; and the components of the vector, when substituted into the expression 7x − 5y + z, would satisfy the equation 7x − 5y + z = 0.
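
One way to see why the system has no solution is to solve the first two equations and test the result in the third. The Python sketch below does exactly that; the function names are hypothetical and the elimination steps are written out in the comments, so treat it as a check on your own hand computation rather than a general solver.

```python
def solve_first_two(x, y):
    """Solve a + 2b = x and 2a + 3b = y by elimination.

    Doubling the first equation and subtracting the second gives b = 2x - y;
    substituting back into the first gives a = x - 2b.
    """
    b = 2 * x - y
    a = x - 2 * b
    return a, b


def third_equation_holds(a, b, z):
    """Check the remaining equation 3a + b = z."""
    return 3 * a + b == z


a, b = solve_first_two(7, 12)
print(a, b)                             # 3 2
print(3 * a + b)                        # 11, but the third equation requires 18
print(third_equation_holds(a, b, 18))   # False
print(7 * 7 - 5 * 12 + 18)              # 7, not 0, so <7, 12, 18> is off the plane
```

Since a = 3 and b = 2 are forced by the first two equations, the failure of the third equation means no choice of a and b works.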

In Activity 8, you wrote a func that essentially performed the operation we have been discussing; in particular, the func accepts as input a single vector v and a set of vectors SV and returns a boolean value obtained by determining whether v could be expressed as a linear combination of the elements of SV. If the vector v could not be written as a linear combination of the elements of SV, how would LU need to be modified to report such a result?

Actually, Activity 8 did a bit more. It checked not only whether the given vector could be expressed as a linear combination of the given set of vectors, but also whether this could be done in exactly one way; that is, was the representation unique? The question of whether a vector can be expressed uniquely as a linear combination of a given set of vectors is very important and will be discussed thoroughly in Section 4.4.
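
Over a finite field, the three possibilities (no representation, exactly one, more than one) can be told apart by counting the scalar sequences that produce the given vector. The Python sketch below illustrates this; it is not the ISETL func LU, and the name count_representations and the small sets in (Z2)^2 are assumptions chosen only to show each case.

```python
from itertools import product


def count_representations(v, vectors, p):
    """Count the scalar sequences over Z_p whose combination equals v."""
    vectors = list(vectors)
    n = len(v)
    count = 0
    for scalars in product(range(p), repeat=len(vectors)):
        combination = tuple(
            sum(a * u[j] for a, u in zip(scalars, vectors)) % p
            for j in range(n)
        )
        if combination == tuple(v):
            count += 1
    return count


# Hypothetical examples in (Z_2)^2:
print(count_representations((0, 1), [(1, 0)], 2))                  # 0: not a combination
print(count_representations((1, 1), [(1, 0), (0, 1)], 2))          # 1: unique
print(count_representations((1, 1), [(1, 0), (0, 1), (1, 1)], 2))  # 2: not unique
```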


Your work in Activity 12 should have involved two additional vector spaces. One is C∞(R). The other is the vector space of all solutions of the differential equation. For which values in R of a, b are the functions a sin, b cos solutions to the differential equation? How would you write a general linear combination of two of these functions? When is it a solution?

Exercises

1. Let V = (Z3)^4 be the vector space of 4-tuples with entries in Z3. For each of the following pairs of vectors v, w, find all linear combinations of v and w, and determine which linear combinations yield the zero vector. You may wish to use the func All_LC to verify your results.

(a) v = 〈1, 1, 2, 2〉 and w = 〈1, 2, 0, 1〉.

(b) v = 〈1, 2, 0, 1〉 and w = 〈2, 1, 0, 2〉.

2. Let V = (Z3)^3. Let

S1 = {〈2, 1, 0〉, 〈1, 2, 1〉}
S2 = {〈1, 1, 2〉, 〈0, 2, 1〉}

be two sets of vectors in V. Find all linear combinations of S1 and of S2. Do S1 and S2 generate the same set of vectors?

3. Let V = K^6, where K = (Z2)^2 = {〈x, y〉 | x, y ∈ Z2} and the additive and multiplicative operations are given by the following formulas: if s, t ∈ K, then

s +_K t = 〈s1, s2〉 +_K 〈t1, t2〉
= 〈(s1 + t1) mod 2, (s2 + t2) mod 2〉

s ·_K t = 〈s1, s2〉 ·_K 〈t1, t2〉
= 〈(s1 · t1 + s2 · t2) mod 2, (s1 · t2 + s2 · t1 + s2 · t2) mod 2〉.

Select any three non-zero vectors from V. Designate one as u, one as v, and the remaining vector as w. Let A = {〈1, 1〉, 〈1, 0〉}, B = {〈0, 1〉, 〈1, 0〉}, and C = {〈1, 1〉, 〈0, 1〉} be three sets of scalars. Find all linear combinations of the form

au + bv + cw,

where a ∈ A, b ∈ B, c ∈ C.

You may want to use the func LC to verify your result.


4. Give seven vectors in R^4 that are in the set generated by

{〈−1, 2, 4,−2〉 , 〈3,−5, 2,−3〉 , 〈1,−1, 2, 1〉 , 〈3,−4, 8,−4〉}.

5. Let V = R^3. Determine whether

S1 = {〈−2, 1, 3〉, 〈−1, 4, 5〉}
S2 = {〈3, 1, −4〉, 〈5, 1, 1〉}

generate the same set of vectors in R^3.


6. Let V = R^3. Determine whether

S1 = {〈2, 1, 0〉, 〈−1, 1, 1〉}
S2 = {〈1, −1, −1〉, 〈3, 0, −1〉}

generate the same set of vectors in R^3.



7. Let V = R^3. Determine whether

S1 = {〈1, 2, 3〉, 〈−1, 2, 5〉, 〈3, 1, 4〉}
S2 = {〈1, 6, 11〉, 〈2, 0, −2〉, 〈1, 2, 3〉}

generate the same set of vectors in R^3.


8. Modify the func LU you constructed in Activity 8, so that, for any vector v and set of vectors SV, LU is able to report whether v can be expressed as a linear combination of SV uniquely (LU reports 1 as output), in more than one way (LU reports 2 as output), or not at all (LU reports 0 as output). Test your modified func for every possible vector v = 〈v1, v2, v3〉 in (Z2)^3, when given the set SV = {〈1, 0, 1〉, 〈0, 1, 1〉}.

9. Let V = R^3. For each part, (a)–(d), determine whether the first vector can be expressed as a linear combination of the remaining three vectors.

(a) 〈−3, 3, 7〉; 〈1, −1, 2〉, 〈2, 1, 0〉, 〈−1, 2, 1〉

(b) 〈2, 7, 13〉; 〈1, 2, 3〉, 〈−1, 2, 4〉, 〈1, 6, 10〉

(c) 〈−1, 4, −9〉; 〈−1, 3, 1〉, 〈1, 1, 1〉, 〈0, 1, 4〉

(d) 〈4, 3, 8〉; 〈−1, 0, 1〉, 〈2, 1, 3〉, 〈0, 1, 5〉


10. Let v be a linear combination of two vectors v1 and v2. Show that v is also a linear combination of c1v1 and c2v2, where c1 ≠ 0 and c2 ≠ 0.

11. Suppose v is not a linear combination of two vectors v1 and v2. Show that v is also not a linear combination of c1v1 and c2v2. Try to do this using the previous exercise and without any calculations.

12. Let W denote the set of vectors generated by {v1, v2}. If v3 ∈ W, prove {v1, v2, v3} generates the same set.

13. Show that the solution set of the equation 23x − 9y + z = 0 is the set of vectors

{s 〈1, 3, 4〉+ t 〈2, 5,−1〉 : s, t ∈ R}.

14. Find the set of vectors whose components satisfy the equation y = 2x + 3.

15. Given the set of vectors

{v + aw : a ∈ R},

where v = 〈3, −2〉 and w = 〈2, 5〉, find the equation of the line whose solution set, when written in vector form, is equal to the set given above.

16. Given the set

{v + aw : a ∈ R},

where v = 〈3, −2〉 and w = 〈2, 5〉 as in the prior exercise, determine what would happen to the graph of this set if the coefficient of v were allowed to vary; that is, if you were given

{bv + aw : a, b ∈ R},

what would this set of linear combinations look like? Draw a graph of this set of linear combinations for b = ±1, ±2, ±3, ±0.5. What do you observe? How does the graphical form of this set differ from the case in which b = 1?

17. Find the equation of the plane, given the generating set

{〈−1,−3, 2〉 , 〈3, 0, 2〉}.


18. Find the set of vectors whose components satisfy the equation y = (5/3)x. What happens to the graph of this set if each vector is multiplied by the scalar 2? What happens to the graph of this set if 〈2, 1〉 is added to each vector in the set?

19. Show that if W is the span of a set of vectors S in a vector space V, then W is a subspace of V.

20. In the vector space Pn(K), describe the span of the following sets of vectors.

(a) {1, x^2, x^4, . . . , x^(n div 2)}

(b) {x, x^2, x^3, . . . , x^n}

(c) {1}

(d) {x}

(e) {1, x}

21. In the vector space PF4(R), in how many ways can you express the polynomial function

x −→ 2 + 3x

as a linear combination of {1, x, x^2, x^3, x^4}?

22. In the vector space PF4(Z3), in how many ways can you express the polynomial function

x −→ 2 + 3x

as a linear combination of {1, x, x^2, x^3, x^4}?

23. Let a, b be real numbers, g the function given by g(x) = sin(x), h the function given by h(x) = cos(x), and f = ag + bh the linear combination. What is the function f given by?

24. For which real numbers a, b is the function f given by the linear combination f = a sin + b cos a solution to the differential equation

f ′′ + f = 0?

4.2 Linear Independence

Activities

1. Let V = (Z2)^4 be the vector space of quadruples of elements of Z2. Let

SETV1 = {〈1, 1, 0, 1〉, 〈1, 0, 1, 1〉, 〈1, 1, 1, 0〉}
SETV2 = {〈1, 1, 1, 1〉, 〈0, 0, 1, 1〉, 〈1, 1, 0, 0〉}
SETV3 = {〈1, 1, 0, 1〉, 〈1, 0, 1, 1〉, 〈0, 0, 1, 1〉}
SETV4 = {〈0, 0, 0, 0〉, 〈1, 1, 0, 0〉, 〈0, 0, 1, 1〉}
SETV5 = {〈1, 0, 1, 0〉, 〈0, 1, 0, 1〉, 〈1, 1, 1, 1〉}

be five sets of vectors from (Z2)^4.

(a) For each set of vectors, write down the expression for each possible linear combination. Do not simplify.

(b) Apply the func LC that you wrote in Section 4.1, Activity 2 to each combination you produced in (a) to decide if it yields the zero vector.

(c) Identify which sets have the property that there is one and only one linear combination that yields the zero vector.

2. Write a func LI that will assume name_vector_space has been run; that will accept one input SETV, where SETV denotes a set of vectors; and that will return a boolean value that tells if there is a unique scalar sequence whose linear combination with SETV yields the zero vector. Verify the construction of LI by checking each set of vectors given in Activity 1.

You will probably need to define a local variable TUPV and include a line of code such as:

TUPV := [x : x in SETV];

You may wish to use one or more of the funcs you defined in the previous section.

3. Write a func LD that will assume name_vector_space has been run; that will accept one input SETV, where SETV denotes a set of vectors; that will convert SETV to a sequence with a line of code such as:

TUPV := [x : x in SETV];

and that will return either the string “the set is independent”, if there is a unique scalar sequence whose linear combination with the vectors in SETV yields the zero vector, or the set of all scalar sequences that yield the zero vector, if more than one such scalar sequence is identified. Verify the construction of LD by checking each set of vectors given in Activity 1.

4. For each set of vectors {u, v, w} given in Activity 1, determine whether u can be written as a linear combination of v and w; determine whether v can be written as a linear combination of u and w; and determine whether w can be written as a linear combination of u and v. Keep track of this information in relation to the results you obtained in Activities 2 and 3.

5. Let V = (Z2)^4 be as in Activity 1. Apply the func All_LC, which you wrote for Section 4.1, Activity 6, to find the set of vectors generated by the zero vector, that is, the set {〈ov〉}. What do you observe? Then, apply the funcs LI and LD to this single-element set. What do you observe?

6. Let V = (Z7)^2. Apply the func All_LC to find the set of vectors generated by the single-vector set {〈3, 2〉}. What do you observe? Then, apply the funcs LI and LD to this set. What do you observe?

7. Let v = 〈2, 3〉 and w = 〈4, 6〉 be two vectors in R^2. Solve the vector equation

a 〈2, 3〉 + b 〈4, 6〉 = 〈0, 0〉

for a and b. How many solutions does this equation have: one? none? infinitely many? As discussed in the last section, the set of vectors generated by these two vectors is given by

{s 〈2, 3〉 + t 〈4, 6〉 : s, t ∈ R}.

Given s ∈ {2, −1, 0.5} and t ∈ {3, −2}, construct all possible linear combinations of the form

s 〈2, 3〉 + t 〈4, 6〉,

and use the func vectors to graph all of the resulting combinations. Based upon your graphs, describe the graph of the set of vectors generated by 〈2, 3〉 and 〈4, 6〉.

8. Let v = 〈1, 2〉 and w = 〈3,−1〉 be two vectors in R2.

(a) Solve the vector equation

a 〈1, 2〉+ b 〈3,−1〉 = 〈0, 0〉

for a and b. How many solutions does this equation have: one? none? infinitely many?

(b) As discussed in the last section, the set of vectors generated by these two vectors is given by

{s 〈1, 2〉+ t 〈3,−1〉 : s, t ∈ R}.

Given s ∈ {2,−1, 0.5} and t ∈ {3,−2}, construct all possible linear combinations of the form

s 〈1, 2〉+ t 〈3,−1〉 ,

and use the func vectors to graph all of the resulting combinations.

(c) Based upon your graphs, describe the graph of the set of vectors generated by 〈1, 2〉 and 〈3,−1〉.

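9. Let v = 〈1, 2, 3〉 and w = 〈2, 4, 6〉 be two vectors in R3.

(a) Solve the vector equation

a 〈1, 2, 3〉+ b 〈2, 4, 6〉 = 〈0, 0, 0〉

for a and b. How many solutions does this equation have: one? none? infinitely many?

(b) As discussed in the last section, the set of vectors generated by these two vectors is given by

{s 〈1, 2, 3〉+ t 〈2, 4, 6〉 : s, t ∈ R}.

Given s ∈ {2,−1, 0.5} and t ∈ {3,−2}, construct all possible linear combinations of the form

s 〈1, 2, 3〉+ t 〈2, 4, 6〉 ,

and then graph each resulting combination by hand.

(c) Based upon your graphs, describe the graph of the set of vectors generated by 〈1, 2, 3〉 and 〈2, 4, 6〉.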

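10. Let v = 〈1, 2, 1〉 and w = 〈2,−1, 3〉 be two vectors in R3.

(a) Solve the vector equation

a 〈1, 2, 1〉+ b 〈2,−1, 3〉 = 〈0, 0, 0〉

for a and b. How many solutions does this equation have: one? none? infinitely many?

(b) As discussed in the last section, the set of vectors generated by these two vectors is given by

{s 〈1, 2, 1〉+ t 〈2,−1, 3〉 : s, t ∈ R}.

Given s ∈ {2,−1, 0.5} and t ∈ {3,−2}, construct all possible linear combinations of the form

s 〈1, 2, 1〉+ t 〈2,−1, 3〉 ,

and then graph each resulting combination by hand.

(c) Based upon your graphs, describe the graph of the set of vectors generated by 〈1, 2, 1〉 and 〈2,−1, 3〉.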

11. In the vector space Pn(K) of all polynomials of degree less than or equal to n with coefficients in the field K, consider the set of polynomials {1, x, x^2, . . . , x^p} where p ≤ n. Is this set linearly independent?

12. Consider the differential equation

f ′′ + f = 0,

where the unknown function f is in C∞(R).


In the previous section, you determined for which values in R of a, b the functions given by a sin(x), b cos(x) and their linear combinations are solutions to the differential equation. You should have decided that all functions given by an expression of the form

a sin(x) + b cos(x)

are solutions.

From among these solutions, pick out several linearly independent sets. What is the largest number of functions that you can have in a linearly independent set?

Discussion

Definition of Linear Independence and Linear Dependence

In Activity 1, you formed all possible linear combinations of the sets of vectors SV 1, SV 2, SV 3, SV 4 and SV 5 in (Z2)4. You then applied the func LC to determine which linear combinations yielded the zero vector. Which of these sets have the property that there are no linear combinations that give the zero vector? Exactly one such linear combination? More than one?

A set of vectors in which only one linear combination yields the zero vector is particularly important and deserves a name: a linearly independent set. Any other set of vectors is called a linearly dependent set.

Here is a precise definition.

Definition 4.2.1. Let V be a vector space over K. A set of vectors

SV = {v1,v2,v3, . . . ,vm}

is linearly independent if there exists one and only one sequence of scalars, namely

SK = [0, 0, 0, . . . , 0]

whose linear combination yields the zero vector; that is,

0v1 + 0v2 + 0v3 + · · ·+ 0vm

is the only linear combination of SV that yields the zero vector.


In the exercises, you will be asked to formulate this definition for linearly dependent sets.

Given any set of vectors

SV = {v1,v2,v3, . . . ,vm},

the linear combination

0v1 + 0v2 + 0v3 + · · ·+ 0vm

yields the zero vector whether the set is linearly independent or dependent. The difference between independence and dependence lies in whether there exist linear combinations, with at least one nonzero scalar, that produce the zero vector. A linearly dependent set has at least one such combination. For a linearly independent set, the situation is the opposite: the only linear combination that yields the zero vector is the one in which all of the scalars are simultaneously zero.

For example, if, for each set from Activity 1, we form an arbitrary linear combination and set it equal to the zero vector,

a 〈1, 1, 0, 1〉+ b 〈1, 0, 1, 1〉+ c 〈1, 1, 1, 0〉 = 〈0, 0, 0, 0〉 (SETV 1)

a 〈1, 1, 1, 1〉+ b 〈0, 0, 1, 1〉+ c 〈1, 1, 0, 0〉 = 〈0, 0, 0, 0〉 (SETV 2)

a 〈1, 1, 0, 1〉+ b 〈1, 0, 1, 1〉+ c 〈0, 0, 1, 1〉 = 〈0, 0, 0, 0〉 (SETV 3)

a 〈0, 0, 0, 0〉+ b 〈1, 1, 0, 0〉+ c 〈0, 0, 1, 1〉 = 〈0, 0, 0, 0〉 (SETV 4)

a 〈1, 0, 1, 0〉+ b 〈0, 1, 0, 1〉+ c 〈1, 1, 1, 1〉 = 〈0, 0, 0, 0〉 (SETV 5),

and then solve each equation for a, b, and c, we would find that the all-zero scalars

a = 0, b = 0, c = 0

satisfy all five equations. For the sets SETV 1 and SETV 3, however, this is the one and only combination that produces the zero vector. This is not the case for the vector sets SETV 2, SETV 4 and SETV 5. Each of these sets has at least one other linear combination that yields the zero vector. Are these results consistent with what you should have found when you applied the funcs LI and LD to the sets SETV 1, SETV 2, SETV 3, SETV 4 and SETV 5? Setting linear combinations equal to the zero vector, as we did above, allows us to rewrite the definition of linear independence in terms of an equation.


Definition 4.2.2. A set of vectors

SV = {v1,v2,v3, . . . ,vm}

in a vector space is linearly independent if and only if there exists a unique solution to the vector equation

a1v1 + a2v2 + a3v3 + · · ·+ amvm = 0,

namely, a1 = a2 = a3 = · · · = am = 0.

In the exercises, you will be asked to formulate a similar definition for linearly dependent, that is, not linearly independent, sets.
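
The five vector equations for SETV 1 through SETV 5 can also be checked by brute force, since (Z2)4 admits only eight scalar triples [a, b, c]. The following is a minimal sketch in Python rather than ISETL; the names `SETS`, `linear_combination`, `zero_solutions` and `is_independent` are hypothetical and are not the funcs LI and LD from the activities.

```python
from itertools import product

P = 2  # scalars are taken from Z_2 = {0, 1}

# The five sets from Activity 1, copied from the equations in the text.
SETS = {
    "SETV1": [(1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0)],
    "SETV2": [(1, 1, 1, 1), (0, 0, 1, 1), (1, 1, 0, 0)],
    "SETV3": [(1, 1, 0, 1), (1, 0, 1, 1), (0, 0, 1, 1)],
    "SETV4": [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)],
    "SETV5": [(1, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 1)],
}

def linear_combination(scalars, vectors, p=P):
    """Return sum(a_i * v_i), with every coordinate reduced mod p."""
    dim = len(vectors[0])
    return tuple(
        sum(a * v[i] for a, v in zip(scalars, vectors)) % p
        for i in range(dim)
    )

def zero_solutions(vectors, p=P):
    """List every scalar sequence whose combination is the zero vector."""
    zero = (0,) * len(vectors[0])
    return [
        scalars
        for scalars in product(range(p), repeat=len(vectors))
        if linear_combination(scalars, vectors, p) == zero
    ]

def is_independent(vectors, p=P):
    """Definition 4.2.2: the all-zero sequence must be the only solution."""
    return len(zero_solutions(vectors, p)) == 1

for name, vectors in SETS.items():
    print(name, is_independent(vectors), zero_solutions(vectors))
```

Exhaustive search is only practical because the field is finite and small; over R the same question has to be answered by solving the system of equations.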

Although we have defined linear independence, we have not yet identified any characteristics that distinguish independence from dependence. One important difference exists in the relationship between the vectors within the linearly independent or dependent sets. In Activity 4, you took each set SETV 1, SETV 2, SETV 3, SETV 4 and SETV 5 and determined whether each vector in the set could be written as a linear combination of the remaining vectors. What did you find in Activity 4? How do these results compare with the linear dependence or independence of these sets? There is, in fact, a general relationship which we establish in the next theorem.

Theorem 4.2.1. Let V be a vector space over the field K. A set of vectors

SV = {v1,v2,v3, . . . ,vq}

is linearly dependent if and only if at least one of the vectors in the set can be written as a linear combination of the remaining vectors.

Proof. (=⇒) : We will assume that SV is linearly dependent, and we will prove that at least one of the vectors in the set can be written as a combination of the others. By definition, the dependence of SV implies that there exists a set of scalars, say {c1, c2, c3, . . . , cq}, where

c1 ≠ 0 or c2 ≠ 0 or c3 ≠ 0 . . . or . . . cq ≠ 0

and

c1v1 + c2v2 + c3v3 + · · ·+ cqvq = 0.

For the purpose of this argument, it does not matter which specific scalar is assumed to be nonzero. So, let’s assume that c1 ≠ 0. In this case, we can divide by c1, from which we clearly see that v1 can be expressed as a linear combination of the remaining vectors:

c1v1 + c2v2 + c3v3 + · · ·+ cqvq = 0

c2v2 + c3v3 + · · ·+ cqvq = −c1v1

−(c2/c1)v2 − (c3/c1)v3 − · · · − (cq/c1)vq = v1

(⇐=) : We will show that if at least one vector in SV can be written as a linear combination of the others, then SV must be linearly dependent. For the purpose of this argument, it does not matter which vector can be written as a combination of the others; let’s assume that v1 is such a vector. Then there exists a set of scalars {c2, c3, . . . , cq} such that

v1 = c2v2 + c3v3 + · · ·+ cqvq.

If we rewrite this equation, we see that

(−1)v1 + c2v2 + c3v3 + · · ·+ cqvq = 0.

Since there is a set {(−1), c2, c3, . . . , cq} of scalars, not all zero, that when combined with SV yields the zero vector, it follows, according to the definition of linear dependence, that SV is a linearly dependent set.
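
Theorem 4.2.1 can be observed computationally on the sets from Activity 1. The sketch below is in Python rather than ISETL and uses hypothetical helper names (`combo`, `span`, `is_dependent`, `expressible_indices`); it tests dependence directly from the definition and, separately, lists which vectors of a set lie in the span of the others.

```python
from itertools import product

def combo(scalars, vectors, p, dim):
    """Linear combination sum(a_i * v_i), computed coordinate-wise mod p."""
    return tuple(
        sum(a * v[i] for a, v in zip(scalars, vectors)) % p
        for i in range(dim)
    )

def span(vectors, p, dim):
    """All linear combinations of `vectors`; an empty list spans only zero."""
    return {
        combo(s, vectors, p, dim)
        for s in product(range(p), repeat=len(vectors))
    }

def is_dependent(vectors, p, dim):
    """True if some scalar sequence, not all zero, yields the zero vector."""
    zero = (0,) * dim
    return any(
        combo(s, vectors, p, dim) == zero
        for s in product(range(p), repeat=len(vectors))
        if any(s)
    )

def expressible_indices(vectors, p, dim):
    """Indices i for which vectors[i] is a combination of the other vectors."""
    return [
        i for i, v in enumerate(vectors)
        if v in span(vectors[:i] + vectors[i + 1:], p, dim)
    ]

# Three of the sets from Activity 1, in (Z_2)^4.
SETV1 = [(1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0)]
SETV2 = [(1, 1, 1, 1), (0, 0, 1, 1), (1, 1, 0, 0)]
SETV4 = [(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1)]

for name, vs in [("SETV1", SETV1), ("SETV2", SETV2), ("SETV4", SETV4)]:
    print(name, is_dependent(vs, 2, 4), expressible_indices(vs, 2, 4))
```

For SETV 4 only the zero vector turns out to be expressible in terms of the others, which illustrates why the theorem says "at least one" rather than "each".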

Another characteristic that differentiates linearly independent and dependent sets involves the relationship between the set and the vectors which the set generates. In Activity 1, you constructed all possible linear combinations of each set of vectors. Did you find that for some of these five sets there was more than one linear combination giving the same answer and for others there was never more than one? That is, in some cases, the representation of any vector as a linear combination of vectors in the set is unique; in others, it is not. How did this compare with the linear independence or dependence of the set? Again there is a general relationship which we establish in the following theorem.

Theorem 4.2.2. Let V be a vector space over a field K. A set of q vectors in V,

SV = {v1,v2,v3, . . . ,vq}

is linearly independent if and only if each vector v contained in the span of SV can be written as a linear combination of SV in one and only one way.


Proof. (=⇒): Assume that SV is a linearly independent set, and let v be an element of the set of vectors generated by SV. By definition, we can write v as a linear combination of SV. Let’s suppose that this can be done in two different ways; that is, suppose there are two different linear combinations of SV that represent the vector v. Using the assumption that SV is linearly independent, we will prove that each pair of corresponding scalars from these two linear combinations consists of two equal scalars. This will prove that v can be expressed as a linear combination of the vectors of SV in one and only one way.

We set the two linear combinations of v equal to one another. We then simplify the equation: we get both linear combinations on one side of the equation; we group like vectors from SV; and then we apply the distributive property for scalar multiples to each vector in SV. In its final form, the equation consists of having the zero vector on one side and a linear combination of SV on the opposite side, where each vector in SV is multiplied by the difference between the corresponding scalars from the two different linear combinations of v. Since we are assuming that SV is linearly independent, each scalar difference is zero. From this, we see that each pair of corresponding scalars consists of two equal scalars. As a result, the two different linear combinations of SV must have the same coefficients: v can be written as a linear combination of SV in one and only one way.

(⇐=): To prove the converse, we assume that any arbitrary vector v in the set of vectors generated by SV can be written as a linear combination of SV in one and only one way. We use this to prove that SV is a linearly independent set.

The set of vectors generated by SV consists of all possible linear combinations of SV. One such combination consists of the expression where each scalar is zero. As a result, the zero vector must be an element of the set of vectors generated by SV. However, because of our assumption, this is the only linear combination of SV that yields the zero vector. According to the definition of linear independence, SV must be a linearly independent set.
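
The uniqueness property in Theorem 4.2.2 can be illustrated by counting how many scalar sequences produce each vector of a span. This is a small Python sketch, not ISETL; `representation_counts` is a hypothetical helper, and the two sets are SETV 1 and SETV 2 from Activity 1.

```python
from collections import Counter
from itertools import product

def representation_counts(vectors, p):
    """Map each vector in the span to the number of scalar sequences giving it."""
    dim = len(vectors[0])
    counts = Counter()
    for scalars in product(range(p), repeat=len(vectors)):
        v = tuple(
            sum(a * w[i] for a, w in zip(scalars, vectors)) % p
            for i in range(dim)
        )
        counts[v] += 1
    return counts

SETV1 = [(1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0)]  # independent in (Z_2)^4
SETV2 = [(1, 1, 1, 1), (0, 0, 1, 1), (1, 1, 0, 0)]  # dependent in (Z_2)^4

for name, vs in [("SETV1", SETV1), ("SETV2", SETV2)]:
    counts = representation_counts(vs, 2)
    # Number of distinct vectors generated, and the representation counts seen.
    print(name, len(counts), sorted(set(counts.values())))
```

A count of 1 for every generated vector corresponds to unique representation; any count above 1 signals two different linear combinations of the same vector.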

In addition to providing insight into the differences between independent and dependent sets, either Theorem 4.2.1 or Theorem 4.2.2 could have been used as the definition of linear independence and dependence, with Definitions 4.2.1 and 4.2.2 proven as theorems. In other words, the two definitions and the two theorems are equivalent formulations of the same concept. In the exercises, you will be asked to use Theorem 4.2.1 to prove Definition 4.2.1 as a theorem. Your proof, taken together with the proof given above, will show that Definition 4.2.1 and Theorem 4.2.1 are logically equivalent; that is, one statement can be substituted for the other as the definition of linear independence or dependence. The reason for making this point is that each statement focuses upon a different, yet equivalent, aspect of independence or dependence. Which definition we choose to employ depends upon the specific circumstances of the problem being posed.

Geometric Interpretation/Generating Sets

Euclidean two-dimensional space R2 contains three different types of objects: points, which are of dimension zero, lines, which are of dimension one, and the entire plane, which is of dimension two. In Activity 5, you discovered that the zero vector in (Z2)4, considered as a single-vector set, is linearly dependent. All linear combinations of the zero vector yield the zero vector. This is also true in R2: the zero vector is a linearly dependent set that generates a zero-dimensional object.

On the other hand, in Activity 6, you discovered that the nonzero single-vector set {〈3, 2〉} is independent and generates multiples of itself. When is a set with a single vector linearly independent? In the previous section, we found that the single-vector set {〈3, 2〉} in R2 generated a set of vectors of the form

{t · 〈3, 2〉 : t ∈ R},

where the graph is given by a line whose equation is y = (2/3)x. For single-vector sets, notice that the set consisting of the zero vector, a dependent set, generates a set of dimension zero, while the set consisting of a nonzero vector, an independent set, generates a set of dimension one.

An analogous result holds for two-vector sets. In Activities 7 and 8, you were asked to determine whether the given set of vectors is independent or dependent and to graph several of the vectors contained in the sets they generate. Based upon your graphs, what would you say about the set of vectors generated by the sets in the two activities? Is there a difference? How does your response compare with the linear dependence or independence of the two sets? Can you formulate a general statement about the relationship between the dependence or independence of a set and the geometric representation of the set it generates in the plane?

As one example of all this, we can prove that the set of vectors generated in R2 by the set

{〈1, 2〉 , 〈3,−1〉},

that is,

{s 〈1, 2〉+ t 〈3,−1〉 : s, t ∈ R},

is not a line through the origin, but is the entire plane. In order to prove this, we must show that every element of R2 can be expressed as a linear combination of 〈1, 2〉 and 〈3,−1〉. In particular, we must show that we can always find scalars s and t such that the vector equation

〈a, b〉 = s 〈1, 2〉+ t 〈3,−1〉

holds for any values of a and b. If we simplify the right hand side of this equation, equate coordinates, and solve for s and t, we get

s = (1/7)a + (3/7)b

t = (2/7)a − (1/7)b.

Since any values of a and b can be substituted into the expressions for s and t, the vector equation given above will have a solution for all possible choices of a and b. Hence, every vector 〈a, b〉 in R2 can be written as a linear combination of 〈1, 2〉 and 〈3,−1〉; the set {〈1, 2〉 , 〈3,−1〉} generates all of R2. In Activity 7, this is not the case. The vector equation

〈a, b〉 = s 〈2, 3〉+ t 〈4, 6〉 ,

which is equivalent to the following system of equations

2s+ 4t = a

3s+ 6t = b,

yields no solution if, for example, a = 1 and b = 2.

These examples suggest that linearly independent sets in R2 span sets whose dimension is equal to the number of vectors in the generating set, whereas linearly dependent sets generate sets whose dimension is less than the number of vectors in the generating set. This observation, as we will see below, also holds in R3. We are using the term dimension here somewhat informally and without explanation. In Section 4.4 we will discuss dimension more thoroughly.
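
The two computations above can be checked with exact rational arithmetic. The sketch below is in Python, with hypothetical function names; `coefficients` encodes the formulas s = (1/7)a + (3/7)b and t = (2/7)a − (1/7)b, and `dependent_pair_solvable` uses the observation that tripling 2s + 4t = a and doubling 3s + 6t = b gives the same left-hand side, so a solution can exist only when 3a = 2b.

```python
from fractions import Fraction

def coefficients(a, b):
    """Scalars s, t with <a, b> = s<1, 2> + t<3, -1>, from the formulas above."""
    a, b = Fraction(a), Fraction(b)
    s = a / 7 + 3 * b / 7
    t = 2 * a / 7 - b / 7
    return s, t

def combine(s, t, v, w):
    """Return the vector s*v + t*w for vectors v, w in R^2."""
    return (s * v[0] + t * w[0], s * v[1] + t * w[1])

def dependent_pair_solvable(a, b):
    """2s + 4t = a and 3s + 6t = b are consistent only when 3a = 2b."""
    return 3 * a == 2 * b

# Every sampled target <a, b> is recovered from <1, 2> and <3, -1>.
for a, b in [(1, 2), (0, 0), (-5, 3), (7, 7)]:
    s, t = coefficients(a, b)
    print((a, b), s, t, combine(s, t, (1, 2), (3, -1)) == (a, b))

# The target <1, 2> from the text cannot be reached with <2, 3> and <4, 6>.
print(dependent_pair_solvable(1, 2))
```

Using `Fraction` avoids floating-point rounding, so the equality checks are exact.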


Like Euclidean two-dimensional space, Euclidean three-dimensional space R3 contains points and lines. The difference is that R3 contains more than one plane, and the entire space itself represents a three-dimensional object. Similar to the R2 case and what you found in Activity 4, any single-vector set consisting of the zero vector is a linearly dependent set that generates only the zero vector, and a single-vector set consisting of a non-zero vector is a linearly independent set that generates scalar multiples of the generating vector. The graph of the set of vectors generated by a nonzero, single-vector set is a line in R3. Can you show this for the set {〈2, 1, 4〉} in R3?

Figure 4.4: Generating a line

In Activities 9 and 10, you studied the geometric representations of the sets of vectors generated by two-vector sets. In Activity 9, you graphed different linear combinations of the dependent set {〈1, 2, 3〉 , 〈2, 4, 6〉}. The six combinations you graphed lie on a line passing through the origin that is oriented in the direction of the vector 〈1, 2, 3〉. This is true for any linear combination of this set, a result consistent with what we found in the case of R2: namely, a linearly dependent set generates a geometric object of dimension smaller than the number of elements in the set. The set in Activity 10, {〈1, 2, 1〉 , 〈2,−1, 3〉}, is linearly independent.

The six linear combinations you were asked to graph lie in the plane generated by 〈1, 2, 1〉 and 〈2,−1, 3〉, a result consistent with what we discovered in the last section and what was discussed earlier in this section. How do we show that this set generates a plane? What is the equation of the plane it generates? How do we know that the set in Activity 9 does not generate a plane? How does one go about finding the equation of the line it does generate?

Figure 4.5: Generating a plane

What about three-vector sets in R3? If we form a set of three vectors in R3, what is the graph of the set it generates? Similar to what we have seen for one-vector and two-vector sets, a linearly dependent set of three vectors will generate a set whose graph is of dimension zero, one, or two, while a linearly independent set of three vectors will generate the entire space R3. Although we will not consider a three-vector set here, the prior examples illustrate an important point: a linearly dependent set of q vectors generates a set whose graph is of dimension less than q, and a linearly independent set of q vectors generates a set whose graph is of dimension q. We will revisit this issue in Section 4.4 when we discuss the notion of dimension in a more general context.


In reference to Activity 11, what does the linear dependence or independence of the set of polynomial functions {x −→ 1, x −→ x, x −→ x^2, . . . , x −→ x^p}, p ≤ n in PFn(K) have to do with the fact that a polynomial of degree p has at most p zeros?

Turning to Activity 12, is there any connection between the order of the differential equation and the maximum number of linearly independent solutions you could find?

Exercises

1. Restate Definition 4.2.1 for linearly dependent sets.

2. Restate Definition 4.2.2 for linearly dependent sets.

3. Determine whether the following sets of vectors are independent or dependent.

(a) {〈1, 1〉 , 〈−3, 2〉 , 〈−2,−1〉} in R2.

(b) {〈−1, 1,−1〉 , 〈4, 3,−2〉 , 〈1,−2, 0〉 , 〈0, 1, 2〉} in R3.

(c) {〈2, 3, 1,−2〉 , 〈−1, 3,−2, 1〉 , 〈4, 2, 1, 3〉 , 〈0, 3,−2, 1〉 , 〈2, 0,−3, 2〉} in R4.

4. Let SV = {〈2, 3, 1〉 , 〈−1, 2,−2〉 , 〈5, 4, 4〉}. Check to see whether each vector in the set can be written as a linear combination of the remaining two vectors. Using only the information you get from checking these combinations, determine whether this set is linearly independent or dependent. Explain how you can make such a determination without invoking Definitions 4.2.1 or 4.2.2 directly.

5. Let SV = {〈2, 3, 1〉 , 〈−1, 2,−2〉 , 〈4,−3,−5〉}. Check to see whether each vector in the set can be written as a linear combination of the remaining two vectors. Using only the information you get from checking these combinations, determine whether this set is linearly independent or dependent. Explain how you can make such a determination without invoking Definitions 4.2.1 or 4.2.2 directly.

6. In this exercise, you are asked to develop the general form of a particular type of linearly independent set. In parts (a)–(c), show that each set of vectors is linearly independent. In parts (d) and (e), use this information to generalize the pattern represented in parts (a)–(c).

(a) {〈2, 0〉 , 〈0, 1〉} in R2.

(b) {〈3, 2, 0〉 , 〈1, 0, 2〉 , 〈0, 1, 1〉} in R3.

(c) {〈1, 1, 2, 0〉 , 〈2, 1, 0, 2〉 , 〈3, 0, 2, 1〉 , 〈0, 1, 2, 1〉} in R4.

(d) Based upon the pattern given in (a)–(c), where we will assume that the components are always non-negative, construct a linearly independent set of vectors in R5. Show that the set you have constructed is indeed independent.

(e) Describe a process for constructing a linearly independent set of n vectors in Rn using the approach outlined above.

7. In this exercise, you are asked to develop the general form of a particular type of linearly independent set. In parts (a)–(c), show that each set of vectors is linearly independent. In parts (d) and (e), use this information to generalize the pattern represented in parts (a)–(c).

(a) {〈1, 0〉 , 〈0, 1〉} in R2.

(b) {〈1, 0, 0〉 , 〈0, 1, 0〉 , 〈0, 0, 1〉} in R3.

(c) {〈1, 0, 0, 0〉 , 〈0, 1, 0, 0〉 , 〈0, 0, 1, 0〉 , 〈0, 0, 0, 1〉} in R4.

(d) Based upon the pattern given in (a)–(c), construct a linearly independent set of vectors in R5. Show that the set you have constructed is indeed independent.

(e) Describe a process for constructing a linearly independent set of n vectors in Rn using the approach outlined above.

8. Let SV 1 = {〈2, 1, 2〉 , 〈3, 1, 4〉} and SV 2 = {〈1, 1, 0〉 , 〈0, 1, 0〉} be two sets of vectors in R3. Show that SV 1 and SV 2 are linearly independent sets. Describe the spans of these two sets. Do they generate the same sets of vectors? Draw a picture of R3 depicting these sets and their spans.

9. Let SV 1 = {〈2, 1, 2〉 , 〈3, 1, 4〉} and SV 2 = {〈−2, 3, 1〉 , 〈1, 0,−2〉} be two sets of vectors in R3. Show that SV 1 and SV 2 are linearly independent sets. Describe the spans of these two sets. Do they generate the same sets of vectors? Draw a picture of R3 depicting these sets and their spans.

10. Restate Theorem 4.2.1 for linearly independent sets.

11. Restate Theorem 4.2.1 in terms of linear independence. Use this statement to prove Definition 4.2.1 as a theorem.

12. Restate Theorem 4.2.2 for linearly dependent sets.


13. The proof of Theorem 4.2.2, although complete, is written in a conversational style without any calculations. An alternative would be to write out all the steps in expressions and equations, as was done in the proof of Theorem 4.2.1. Rewrite the proof of Theorem 4.2.2 in this more computational style.

14. Let {v1,v2,v3} be a linearly dependent set. Let c be a non-zero scalar. Show that the following sets are also linearly dependent.

(a) {v1,v1 + v2,v3}

(b) {v1, cv2,v3}

15. Let {v1,v2} be a linearly independent set. If v3 cannot be written as a linear combination of v1 and v2, that is,

v3 ≠ av1 + bv2,

for any pair of scalars a and b, then show that {v1,v2,v3} is a linearly independent set.

16. Prove or provide a counterexample. If three nonzero vectors {u,v,w} are linearly dependent, it must be the case that u is a linear combination of v and w.

17. Let {〈−2, 1,−1〉 , 〈4,−2, 2〉} be a two-vector set in R3. Determine whether this set is linearly dependent or linearly independent. Describe the span of this set as a set of points in R3. Are your results, in terms of the issue of dimension, consistent with what was discussed in the text? Explain your answer.

18. Let {〈3,−4, 1〉 , 〈2, 5,−3〉} be a two-vector set in R3. Decide whether this set is linearly dependent or linearly independent. Describe the span of this set as a set of points in R3. Are your results, in terms of the issue of dimension, consistent with what was discussed in the text? Explain your answer.

19. Let {〈−4, 1〉 , 〈2, 5〉} be a two-vector set in R2. Determine whether this set generates the entire plane R2.

20. Let {〈−1, 4, 3〉 , 〈3, 5, 2〉 , 〈1,−1, 3〉} be a three-vector set in R3. Determine whether this set generates the entire space R3 using an approach similar to that given for generating sets of R2. Is this set linearly independent or linearly dependent? Is your result consistent with the discussion given in the text? Explain.

21. Let {〈1, 4, 2〉 , 〈−1, 3,−1〉 , 〈2, 1, 3〉} be a set of three vectors in R3. Determine whether the first vector can be written as a linear combination of the remaining vectors. Repeat this process for the second and third vectors. Without making any further calculations, is this set linearly independent or linearly dependent? What can you say about the dimension of the graph of the set of vectors generated by this set? Carefully explain your answer.

22. Prove that any set of monomial functions in PFn(R) is linearly independent.

23. Which sets of monomial functions in PFn(R) span all of PFn(R)?

24. Discuss the results of Exercises 22 and 23 if R is replaced by Z3.

4.3 Generating Sets and Linear Independence

Activities

1. Let

SETV 1 = {〈2, 1, 3, 2〉 , 〈1, 1, 3, 1〉 , 〈3, 2, 2, 3〉}
SETV 2 = {〈3, 2, 1, 3〉 , 〈4, 3, 0, 4〉 , 〈2, 2, 1, 2〉}

be two sets of vectors in (Z5)4.

(a) Use the func LI that you wrote in Section 4.2, Activity 2 to verify that both sets are linearly independent.

(b) Apply the func All LC from Section 4.1, Activity 6 to find the set of vectors generated by each set. What do you observe? Are the two spans equal?

(c) Apply the modified version of the func LU you constructed in Section 4.1, Exercise 8 to determine whether each vector in SETV 2 can be written as a linear combination of the vectors in SETV 1. What do you observe?

2. Let

SETV 1 = {〈2, 1, 3, 2〉 , 〈1, 1, 3, 1〉 , 〈3, 2, 2, 3〉}
SETV 3 = {〈1, 2, 0, 2〉 , 〈3, 1, 1, 2〉 , 〈0, 3, 0, 0〉}

be two sets of vectors in (Z5)4.

(a) Use the func LI to verify that both sets are linearly independent.

(b) Apply the func All LC to find the set of vectors generated by each set. What do you observe? Are the two sets equal?

(c) Apply the modified version of the func LU you constructed in Section 4.1, Exercise 8 to determine whether each vector in SETV 3 can be written as a linear combination of the vectors in SETV 1. What do you observe?

(d) The results from this and the prior activity illustrate a general principle. Formulate a conjecture based upon your findings.

3. Let

SETV 4 = {〈1, 2, 3, 4〉 , 〈3, 3, 3, 2〉 , 〈2, 1, 0, 3〉 , 〈3, 2, 1, 2〉}

be a set of vectors in (Z5)4. Perform the following tasks with respect to this set of vectors.

(a) Apply the func All LC to find the span of this set.

(b) Apply the func LI to determine whether the set SETV 4 is independent or dependent.

(c) Apply the modified version of the func LU to determine whether any vector in the set can be written as a linear combination of the remaining vectors.

(d) If the answer to (c) is yes, remove one such vector, and denote the remaining set as SETV 5. Repeat steps (a), (b), and (c) with SETV 5. Is the span of SETV 5 the same as the span of SETV 4? Is SETV 5 linearly independent? If the answer to part (c) is yes, repeat the process: remove one of the vectors that can be written as a linear combination of the remaining vectors, and denote the new set as SETV 6. Repeat (a), (b), and (c) with SETV 6 and beyond that, if necessary, until you arrive at an answer of no in part (c).

(e) When you get a no answer in part (d), what is your answer to part (c)? Is there a relationship between whether a set is independent and whether a vector in that set can be written as a linear combination of the others? Does the “final” set you get generate the same set of vectors as the original set SETV 4? Explain your answer.

4. Write a func LIGS that assumes that name vector space has been executed; that accepts one input SETV, where SETV is a set of vectors; and that returns a linearly independent set constructed by employing the following process: the func takes one of the vectors in SETV, tests whether it is a combination of the others, removes the vector from the set if it is, leaves the vector in the set if it is not, and successively

repeats this process until each vector in the set has been checked. Test the func LIGS on the set SETV 4 that you worked with in Activity 3. Does LIGS return the same set you got after having completed parts (a)–(d) of Activity 3?

5. Let SETV = {〈1, 1, 0〉 , 〈0, 1, 1〉 , 〈1, 0, 0〉} be a set of vectors in (Z2)3. Verify that SETV is linearly independent. Show that SETV generates the entire set of vectors (Z2)3. Select a vector v different from 〈1, 1, 0〉, 〈0, 1, 1〉, and 〈1, 0, 0〉, and form the new set

{〈1, 1, 0〉 , 〈0, 1, 1〉 , 〈1, 0, 0〉 ,v}.

Test whether the resulting set is independent. What do you observe? Repeat this for every possible choice of v that is not equal to 〈1, 1, 0〉, 〈0, 1, 1〉, or 〈1, 0, 0〉. What do you observe?

6. Let

SETV = {〈2, 3, 4, 1, 1〉 , 〈1, 1, 3, 1, 2〉 , 〈3, 4, 2, 2, 3〉 , 〈4, 0, 0, 3, 0〉 , 〈1, 4, 2, 3, 3〉}

be a set of vectors in (Z5)5.

In parts (b)–(d) below, you can use the predefined ISETL func npow to construct the set of subsets with a given cardinality.

(a) Use the func LI to show that SETV is linearly dependent.

(b) Construct all subsets of SETV that consist of two vectors. Apply LI to each set. What do you observe?

(c) Construct all subsets of SETV that consist of three vectors. Apply LI to each set. What do you observe?

(d) Construct all subsets of SETV that consist of four vectors. Apply LI to each set. What do you observe?

7. Let V = (Z5)3. Apply the func LI to each part (a)–(d). Discuss your findings: in particular, compare your result for part (d) with what you get for parts (a)–(c).

(a) {〈4, 4, 2〉 , 〈2, 1, 3〉}

(b) {〈4, 4, 2〉 , 〈3, 1, 3〉}

(c) {〈2, 1, 3〉 , 〈3, 1, 3〉}

(d) {〈4, 4, 2〉 , 〈2, 1, 3〉 , 〈3, 1, 3〉}

8. Use the func LI to determine whether each of the following sets is independent or dependent.

(a) {〈2, 1, 0〉 , 〈1, 1, 1〉}; {〈2, 1, 0〉 , 〈1, 1, 1〉 , 〈0, 0, 0〉} in (Z3)3

(b) {〈2, 3, 1, 4〉 , 〈3, 3, 2, 1〉 , 〈1, 2, 1, 4〉}; {〈2, 3, 1, 4〉 , 〈3, 3, 2, 1〉 , 〈1, 2, 1, 4〉 , 〈0, 0, 0, 0〉} in (Z5)4

What do you think is the point of this activity? Explain.

Discussion

In Section 4.1, generating sets were defined, and in Section 4.2, the concepts of linear independence and linear dependence were defined. In this section, we will study the relationship between generating sets and the sets of vectors they generate; discuss how to construct a linearly independent set when given any set of vectors; present various special forms of linearly independent sets of vectors; and prove important properties of linearly independent and dependent sets.

Generating Sets and Their Spans

In Activity 1, you were given two sets of vectors in (Z5)4. You were asked to find and compare the sets of vectors generated by the two and to determine the relationship, in terms of linear combinations, between the generating sets. In Activity 2, one of the sets was changed and you did the same thing with the resulting pair of sets.

What were the various phenomena that you observed? See how long a list of observations you can make. For each observation formulate a statement that says that what you observed is true in general. Is your “theorem” correct? Try to supply a proof or a counterexample as appropriate.

One of your observations might have been an important relationship between the sets generated by two sets of generators and the possibility of writing every generator in one set as a linear combination of the generators in the other set. This relationship is formalized and proved in the next theorem.

Theorem 4.3.1. Two sets of vectors

SV 1 = {v1,v2, . . . ,vq}
SV 2 = {w1,w2, . . . ,wq}

in a vector space V generate the same set of vectors in V if and only if each vector in SV 2 can be written as a linear combination of the vectors in SV 1, and vice-versa.

Proof. (=⇒:) Let us assume that SV 1 and SV 2 generate the same set of vectors. We will then prove that each vector in SV 2 can be written as a linear combination of the vectors in SV 1. If u is an element of the span of SV 2, then u can be written as a linear combination of the vectors w1, w2, . . . , and wq. Now, each vector wi, i = 1, 2, · · · , q, in SV 2 is an element of the span of SV 2, and since SV 1 and SV 2 are assumed to generate the same set of vectors, it follows that each wi, i = 1, 2, · · · , q, is in the span of SV 1. This means that each vector wi for i = 1, 2, · · · , q can be written as a linear combination of the elements of SV 1.

In a similar manner, we can show that each vector in SV 1 can be written as a linear combination of the vectors in SV 2.

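($\Longleftarrow$) We will assume that each vector in $SV_2$ can be written as a combination of the vectors in $SV_1$, and vice versa. We will then prove that $SV_1$ and $SV_2$ generate the same set of vectors. Let $\mathbf{u}$ be a vector in the set generated by $SV_2$. Then there exist scalars $a_1, a_2, \ldots, a_q$ such that

\[
\mathbf{u} = a_1\mathbf{w}_1 + a_2\mathbf{w}_2 + \cdots + a_q\mathbf{w}_q.
\]

Since each vector in $SV_2$ can be written as a combination of $SV_1$, there exist sequences of scalars

\[
[b_{11}, b_{12}, \ldots, b_{1q}],\quad [b_{21}, b_{22}, \ldots, b_{2q}],\quad \ldots,\quad [b_{q1}, b_{q2}, \ldots, b_{qq}]
\]

such that

\[
\begin{aligned}
\mathbf{w}_1 &= b_{11}\mathbf{v}_1 + b_{12}\mathbf{v}_2 + \cdots + b_{1q}\mathbf{v}_q \\
\mathbf{w}_2 &= b_{21}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots + b_{2q}\mathbf{v}_q \\
&\;\;\vdots \\
\mathbf{w}_q &= b_{q1}\mathbf{v}_1 + b_{q2}\mathbf{v}_2 + \cdots + b_{qq}\mathbf{v}_q.
\end{aligned}
\]

Substituting these expressions for the $\mathbf{w}_i$ in the expression for $\mathbf{u}$, we get

\[
\begin{aligned}
\mathbf{u} &= a_1\mathbf{w}_1 + a_2\mathbf{w}_2 + \cdots + a_q\mathbf{w}_q \\
&= a_1[b_{11}\mathbf{v}_1 + b_{12}\mathbf{v}_2 + \cdots + b_{1q}\mathbf{v}_q]
 + a_2[b_{21}\mathbf{v}_1 + b_{22}\mathbf{v}_2 + \cdots + b_{2q}\mathbf{v}_q]
 + \cdots + a_q[b_{q1}\mathbf{v}_1 + b_{q2}\mathbf{v}_2 + \cdots + b_{qq}\mathbf{v}_q] \\
&= [a_1b_{11} + a_2b_{21} + \cdots + a_qb_{q1}]\mathbf{v}_1
 + [a_1b_{12} + a_2b_{22} + \cdots + a_qb_{q2}]\mathbf{v}_2
 + \cdots + [a_1b_{1q} + a_2b_{2q} + \cdots + a_qb_{qq}]\mathbf{v}_q,
\end{aligned}
\]

a linear combination of $SV_1$. Hence, $\mathbf{u}$ is an element of the set of vectors generated by $SV_1$. Since $\mathbf{u}$ was chosen arbitrarily, each vector in the set generated by $SV_2$ is also generated by $SV_1$. In a similar fashion, we can show that every vector contained in the set generated by $SV_1$ is generated by $SV_2$. As a result, $SV_1$ and $SV_2$ generate the same set of vectors.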
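In a small finite space the theorem can be checked by brute force. The following is a minimal ISETL sketch, not the book's code: span_p and same_span are hypothetical helpers introduced only for this illustration, vectors are written as plain tuples with entries in Z_p, and name_vector_space is deliberately not used so that the sketch stands on its own.

```isetl
$ Sketch only: span_p and same_span are hypothetical helpers, not All_LC.
$ Vectors are tuples whose entries are integers taken mod p.
p := 5;

span_p := func(SV);
    local S, v;
    $ start from the zero vector, then add every multiple of each generator
    S := { [0 : i in [1..#(SV(1))]] };
    for v in SV do
        S := { [(s(i) + a*v(i)) mod p : i in [1..#v]] : s in S, a in [0..p-1] };
    end;
    return S;
end;

same_span := func(SV1, SV2);
    return span_p(SV1) = span_p(SV2);
end;

SV1 := [[1,0], [0,1]];
SV2 := [[1,1], [1,2]];
same_span(SV1, SV2);
```

In this example each vector of SV2 is visibly a combination of the two coordinate vectors in SV1, and conversely ⟨1, 0⟩ = 2⟨1, 1⟩ + 4⟨1, 2⟩ and ⟨0, 1⟩ = 4⟨1, 1⟩ + 1⟨1, 2⟩ in $(\mathbb{Z}_5)^2$, so the theorem predicts that the two spans agree.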

Constructing Linearly Independent Generating Sets

In Activity 3, you removed vectors until you ended up with a linearly independent set. Every time you removed a vector, you applied All_LC to the new set. What was the relationship between the span of the new set and the span of its predecessor? In constructing a new generating set, were you allowed to remove just any vector?

How do your responses to these questions relate to the following theorem? In the last section, we proved that an important property of linearly dependent sets, one not shared by independent sets, is that at least one of the vectors in such a set can be written as a linear combination of the other vectors in the set. Since the set of vectors generated by a set consists of linear combinations of the generators, it seems reasonable that any generator that can be written as a combination of the other generators is redundant, a fact consistent with what you found in Activity 3 and one which we shall now prove in general.


Theorem 4.3.2. If a set of vectors

\[
SV = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_q\}
\]

in a vector space is linearly dependent, and if one of the vectors, say $\mathbf{v}_1$, is a linear combination of the vectors in $SV^* = \{\mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_q\}$, then $SV$ and $SV^*$ generate the same set of vectors.

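Proof. We can simply apply Theorem 4.3.1 with $SV_1 = SV$ and $SV_2 = SV^*$. Each vector of $SV^*$ belongs to $SV$ and so is trivially a linear combination of the vectors in $SV$. Conversely, $\mathbf{v}_1$ is a linear combination of the vectors in $SV^*$ by hypothesis, and every other vector of $SV$ is itself a member of $SV^*$. (The proof of Theorem 4.3.1 does not depend on the two sets having the same number of elements.)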

Think about this theorem in relation to Activity 4. You wrote a func LIGS that removed vectors from a linearly dependent set until it either became a linearly independent set or became empty. What does the theorem tell you about the set of vectors generated by the resulting linearly independent set?

This theorem gives us a means by which we can construct a linearly independent generating set whenever we are given a generating set: we simply remove vectors which can be written as linear combinations of the remaining vectors and continue until this is no longer possible. A linearly independent generating set is called a basis, and this will be the topic of the next section.
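As a small illustration of this removal process (the set below is our own example, not one taken from the activities), consider three vectors in $\mathbb{R}^3$ in which the third is the sum of the first two:

```latex
\[
S = \{\langle 1,0,0\rangle,\ \langle 0,1,0\rangle,\ \langle 1,1,0\rangle\},
\qquad
\langle 1,1,0\rangle = 1\,\langle 1,0,0\rangle + 1\,\langle 0,1,0\rangle .
\]
\[
a\langle 1,0,0\rangle + b\langle 0,1,0\rangle + c\langle 1,1,0\rangle
  = (a+c)\,\langle 1,0,0\rangle + (b+c)\,\langle 0,1,0\rangle .
\]
```

Every combination of the three vectors is therefore a combination of the first two, so removing ⟨1, 1, 0⟩ leaves the span unchanged. The two vectors that remain are linearly independent, because a⟨1, 0, 0⟩ + b⟨0, 1, 0⟩ = ⟨a, b, 0⟩ is the zero vector only when a = b = 0.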

How about going the other way? What happens if we have a set $S$ of vectors and add to it a vector which is already a linear combination of the vectors in $S$? How does the set of vectors generated by the new set compare with the set of vectors generated by $S$?

Properties of Linear Independence and Linear Dependence

In the last section, in addition to defining independence and dependence, we stated and proved two equivalent conditions for linear independence. In particular, a set of vectors is linearly independent if and only if either of the following properties holds:

• no vector in the set can be written as a linear combination of the remaining vectors;

• any vector generated by the set can be expressed as a linear combination of elements of the generating set in one and only one way.

In this subsection, we will prove two necessary conditions for independence. In Activity 6, you constructed all proper subsets (sets with one or more of the original elements missing) of two, three, and four vectors of the dependent set

SETV = {〈2, 3, 4, 1, 1〉 , 〈1, 1, 3, 1, 2〉 , 〈3, 4, 2, 2, 3〉 ,〈4, 0, 0, 3, 0〉 , 〈1, 4, 2, 3, 3〉}.

You then applied the func LI to each subset to determine which subsets were independent and which were dependent. In Activity 7 you looked at some more examples. What did you find? What do these activities tell you about the linear dependence or independence of a subset of a linearly dependent set? Does Activity 8 suggest anything about that? The following theorem addresses these questions for subsets of a linearly independent set.

What about subsets of a linearly independent set?

Theorem 4.3.3. If $S$ is a linearly independent set in a vector space $V$, then every proper subset of $S$ must also be linearly independent.

Proof. The proof is left to the exercise section.

Is there a similar result for linearly dependent sets? In particular, must every subset of a dependent set be dependent? Is it possible for a linearly dependent set to have both subsets which are dependent and subsets which are independent?

How about going the other way? Suppose you took a set of vectors and added some vectors. What could you say about the larger set if the smaller set was linearly independent? Linearly dependent?

In each part of Activity 7, you were given two sets of vectors: one which was an independent set, and the second which was the same set with the zero vector added. In both cases, the set with the zero vector was dependent. Before looking at the following theorem, think about what might be true in general.

Theorem 4.3.4. If $SV$ is a linearly independent set of vectors in a vector space $V$, then $SV$ cannot contain the zero vector.

Proof. The proof is left to the exercise section.


Here is a “trick” question. Suppose you began with a set that was linearly dependent and began removing vectors. Can you be sure that you will eventually obtain a linearly independent set? This is one of those statements that is false but, really, it is true. What could that mean?


In the vector space $PF_n(K)$, what would you say is the set generated by the set of vectors $\{x \mapsto 1,\ x \mapsto x,\ \ldots,\ x \mapsto x^p\}$, where $p \le n$?

In the vector space $C^\infty(\mathbb{R})$, what relation does the subspace generated by the functions $\sin$ and $\cos$ have to the set of solutions of the differential equation

\[
f'' + f = 0?
\]

Exercises

1. Let $SV = \{(2, 1, 3, -1),\ (-2, 1, 1, -3),\ (3, 4, -1, 1)\}$ be a linearly independent set of vectors in $\mathbb{R}^4$. Construct a second linearly independent set that spans the same set as $SV$. Verify that the spans of these two sets are the same.

2. Let $SV$ be the same set as that given in Exercise 1. Construct a second linearly independent set that spans a different set of vectors from that spanned by $SV$. Verify that the spans of these two sets are indeed different.

3. Let

SV = {〈−1,−2, 3, 2〉 , 〈2, 3, 3,−1〉 , 〈1, 1, 6, 1〉 , 〈−4,−5,−15,−1〉}

be a set of vectors in $\mathbb{R}^4$. Show that this set is linearly dependent. Construct a linearly independent set by removing dependent vectors. Show that the two sets generate the same set of vectors. What would you say is the dimension of the set of vectors generated by this set?

4. Let

SV = {〈−3, 4, 1〉, 〈1, 1, −2〉, 〈−5, 2, 5〉}

be a set of vectors in $\mathbb{R}^3$. Show that this set is linearly dependent. Construct a linearly independent set by removing dependent vectors. Show that the two sets generate the same set of vectors. What would you say is the dimension of the set of vectors generated by this set? Explain.

5. Let $V = (K)^n$ be the vector space of all $n$-tuples whose components consist of elements of $K$. Prove that the set

{〈1, 1, 1, . . . , 1〉 , 〈1, 1, 1, . . . , 1, 0〉 , 〈1, 1, 1, . . . , 0, 0〉 ,· · · , 〈1, 1, 0, . . . , 0〉 , · · · , 〈1, 0, 0, . . . , 0, 0〉}

is linearly independent.

6. Let $V = (K)^n$ be the vector space of all $n$-tuples whose components consist of elements of $K$. Prove that the set

{〈1, 0, 0, · · · , 0〉, 〈0, 1, 0, · · · , 0〉, 〈0, 0, 1, · · · , 0〉, · · · , 〈0, 0, 0, · · · , 0, 1〉}

is linearly independent.


7. Provide the proof of Theorem 4.3.3.

8. Provide the proof of Theorem 4.3.4.

9. Without performing any calculations, explain why each of the following sets in $\mathbb{R}^3$ is linearly dependent.

(a) {〈1, 2, 3〉, 〈0, 0, 0〉, 〈2, 1, 3〉}

(b) {〈1, −1, 2〉, 〈3, 4, 1〉, 〈1, 1, 0〉, 〈2, 1, 3〉}

(c) {〈3, 5, 1〉, 〈3, 1, 2〉, $\mathbf{v}$}, where $\mathbf{v} = a$〈3, 5, 1〉 $+\ b$〈3, 1, 2〉 for some scalars $a$ and $b$

(d) $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}\}$, where $\{\mathbf{v}_1, \mathbf{v}_2\}$ is a linearly dependent set, but there exist no scalars $a$ and $b$ such that $\mathbf{v} = a\mathbf{v}_1 + b\mathbf{v}_2$.

10. For each of the parts (a)–(c) below, find a linearly independent set that generates the same set as the set given. Then, show that the resulting linearly independent set generates all of $\mathbb{R}^n$. Try to complete each part without making specific computations; in short, use the theorems proven and the concepts discussed in this and the prior section to justify your claims.


(a) {(2, 1), (1, 3), (1, 1)}, n = 2

(b) {(1, 2, 3), (3, 1, 2), (1, 4, 3), (2, 4,−1)}, n = 3

(c) {(2, 0, 1, 0), (0, 1, 3, 1), (1, 2, 1, 1), (3, 2, 1, 4), (1, 1,−1, 2)}, n = 4

11. Construct a linearly dependent set in $\mathbb{R}^3$ whose proper subsets are all linearly independent.

12. Given two sets in $\mathbb{R}^3$, say

\[
SV_1 = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}, \qquad
SV_2 = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\},
\]

where $\mathbf{w}_1$ and $\mathbf{w}_2$ are both linear combinations of $SV_1$ but $\mathbf{w}_3$ is not, determine whether $SV_1$ and $SV_2$ generate the same set of vectors. Justify your answer. If necessary, construct two sets satisfying the given conditions to illustrate your point.

13. Let $\{\langle 2, 1, 3, -2\rangle,\ \langle 1, -1, -3, 4\rangle\}$ be a set of vectors in $\mathbb{R}^4$.

(a) This set is linearly independent. Why?

(b) Find two more vectors $\mathbf{v}$ and $\mathbf{w}$ so that the resulting expanded set

\[
\{\langle 2, 1, 3, -2\rangle,\ \langle 1, -1, -3, 4\rangle,\ \mathbf{v},\ \mathbf{w}\}
\]

is linearly independent.

(c) Once you have found two such vectors, show that the set you have constructed is indeed independent.

(d) Based upon the discussion regarding dimension given in this section and at the end of the last section, make a determination as to whether the set you have constructed generates all of $\mathbb{R}^4$.

14. Let

S = {v1,v2,v3,v4}

be a set in $V = (K)^3$.

(a) Is $S$ linearly independent or linearly dependent? Explain your answer.

(b) If every subset of $S$ consisting of three vectors is linearly independent, how could we construct a linearly independent set that generates all of $V$?

(c) If every subset consisting of three vectors is linearly independent, must it follow that every subset of two vectors must be linearly independent? Justify your answer.

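15. Suppose $\mathbf{w}_1$ and $\mathbf{w}_2$ are two vectors that are combinations of $\mathbf{v}_1$ and $\mathbf{v}_2$ such that

\[
\begin{aligned}
\mathbf{w}_1 &= a\mathbf{v}_1 + b\mathbf{v}_2 \\
\mathbf{w}_2 &= c\mathbf{v}_1 + d\mathbf{v}_2 \\
a, b, c, d &\neq 0 \\
\frac{a}{c} &\neq \frac{b}{d}.
\end{aligned}
\]

Is the set $\{\mathbf{w}_1, \mathbf{w}_2\}$ linearly independent or linearly dependent? Carefully justify your answer. Do the two sets $\{\mathbf{v}_1, \mathbf{v}_2\}$ and $\{\mathbf{w}_1, \mathbf{w}_2\}$ generate the same set? Explain.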

16. Suppose you take a linearly dependent set and remove vectors one by one. Can you always be sure that you eventually obtain an independent set?

17. In the vector space $PF_n(K)$, describe the set generated by the vectors $\{x \mapsto 1,\ x \mapsto x,\ \ldots,\ x \mapsto x^p\}$, where $p \le n$. What about the set generated by $\{x \mapsto x,\ \ldots,\ x \mapsto x^p\}$, or the set generated by the monomial functions with odd exponents?

18. In the vector space $PF_n(K)$, the set $\{x \mapsto 1,\ \ldots,\ x \mapsto x^{n-1}\}$ is linearly independent. Based on this, what can you say about the sets $\{x \mapsto 1,\ \ldots,\ x \mapsto x^p\}$, where $p < n$? What about the sets $\{x \mapsto x,\ \ldots,\ x \mapsto x^p\}$, $p < n$? Or the set of all monomials with odd exponents? Even exponents?

19. In the vector space $C^\infty(\mathbb{R})$, what relation does the subspace generated by the functions $\sin$ and $\cos$ have to the set of solutions of the differential equation

\[
f'' + f = 0?
\]


4.4 Bases and Dimension

Activities

1. In Section 4.1, Activity 2, you wrote the func LC that accepted a sequence of scalars SK and a sequence of vectors SV and returned the vector that was the linear combination of the vectors in SV using the scalars in SK. There is an ISETL operation called % that can be applied to .va to make forming linear combinations easy. Here is a version of LC that uses this feature. It is assumed that name_vector_space has been run.

LC := func(SK, SV);
    return %.va[SK(i) .sm SV(i) : i in [1..#SK]];
end;

(a) Describe in words how the operation %.va works.

(b) Let $V = (\mathbb{Z}_5)^3$, run name_vector_space, define SK and SV, and run the line

%.va[SK(i) .sm SV(i) : i in [1..#SK]];

on the following two examples.

i. SK1 = [1, 4, 3], SV1 = [〈1, 3, 1〉, 〈2, 1, 4〉, 〈4, 0, 2〉]

ii. SK2 = [3, 0, 3], SV2 = [〈3, 4, 4〉, 〈2, 1, 0〉, 〈3, 3, 3〉]

(c) Based on your experience with %, predict what ISETL will do with each of the following three lines of code, run them, and explain any relationship between the expressions.

i. LC(SK1,SV1); LC(SK2,SV2);

ii. LC(SK1,SV1).va LC(SK2,SV2);

iii. LC(SK1 + SK2, SV1 + SV2);

Note: The ISETL operation + here will concatenate the two tuples.

2. Let SETV = {〈1, 1, 0〉, 〈0, 1, 1〉} be a set of vectors in $(\mathbb{Z}_2)^3$.

(a) Verify that SETV is linearly independent.

(b) Show that SETV does not generate the entire set of vectors $(\mathbb{Z}_2)^3$.

(c) Select a vector $\mathbf{v}$ different from 〈1, 1, 0〉 and 〈0, 1, 1〉 and form the new set {〈1, 1, 0〉, 〈0, 1, 1〉, $\mathbf{v}$}. Test whether the resulting set is independent. What do you observe?

(d) Repeat this for every possible choice of $\mathbf{v}$ that is not equal to 〈1, 1, 0〉 or 〈0, 1, 1〉. What do you observe? Explain.

3. For each vector space V and set of vectors SETV listed below, apply the func LI from Section 4.2, Activity 2 to determine if the set is linearly independent and the func All_LC from Section 4.1, Activity 6 to determine if the set of vectors generated by SETV is equal to V.

(a) $V = (\mathbb{Z}_2)^5$

SETV1 = {〈1, 1, 1, 1, 1〉, 〈0, 1, 1, 1, 1〉, 〈0, 0, 1, 1, 1〉, 〈0, 0, 0, 1, 1〉, 〈0, 0, 0, 0, 1〉}

SETV2 = {〈0, 1, 1, 1, 1〉, 〈1, 0, 1, 1, 1〉, 〈1, 1, 0, 1, 1〉, 〈1, 1, 1, 0, 1〉}

SETV3 = {〈1, 1, 1, 1, 1〉, 〈0, 0, 1, 1, 1〉, 〈0, 0, 0, 0, 1〉, 〈1, 1, 0, 0, 1〉, 〈1, 0, 0, 0, 1〉}

(b) $V = (\mathbb{Z}_3)^4$

SETV4 = {〈1, 2, 2, 1〉, 〈0, 1, 2, 1〉, 〈0, 0, 2, 1〉}

SETV5 = {〈0, 1, 2, 1〉, 〈2, 0, 2, 2〉, 〈1, 2, 0, 1〉, 〈2, 2, 2, 0〉}

SETV6 = {〈1, 0, 1, 1〉, 〈0, 0, 1, 2〉, 〈0, 0, 0, 2〉, 〈1, 1, 2, 0〉}

SETV7 = {〈1, 1, 1, 1〉, 〈0, 0, 1, 2〉, 〈0, 0, 0, 2〉, 〈1, 1, 2, 0〉, 〈1, 0, 0, 0〉}

4. Write a func is_basis that assumes name_vector_space has been run; that accepts a set of vectors and a vector space V; that identifies the set as being linearly dependent or independent; and that determines whether the span of the set is V. In your func, you may want to rule out the empty set right away.

Test your func on all of the examples in Activity 3.

5. Let $P_n(K)$ be the vector space of all polynomials of degree less than or equal to $n$ with coefficients in the field $K$. Set up appropriate representations and apply is_basis to solve the following problems.


(a) Determine if the set of monomials $\{1, x, x^2, x^3, x^4\}$ is a basis for $P_4(\mathbb{Z}_3)$.

(b) Find a basis for $P_4(\mathbb{Z}_3)$ as different from the monomials as you can.

(c) Find an example of a linearly independent set and an example of a set which generates the entire vector space, neither of which is a basis for $P_4(\mathbb{Z}_3)$.

6. For the following system of linear equations with coefficients in $\mathbb{Z}_5$, use the func Three_eqn that you wrote for Activity 9 in Section 3.1 to find the solution set. Run name_vector_space on this subspace of $(\mathbb{Z}_5)^3$. Then, find a linearly independent set whose span is the solution set. Apply the func is_basis to this linearly independent set.

\[
\begin{aligned}
x_1 + x_2 + 4x_3 &= 0 \\
2x_1 + 2x_2 + 3x_3 &= 0 \\
3x_1 + 3x_2 + 2x_3 &= 0.
\end{aligned}
\]

7. If V is a vector space, and B is a set of vectors for which is_basis returns true, then we know that each vector in V can be written uniquely as a linear combination of the vectors in B. This raises the problem, given B and a vector $\mathbf{v}$, of finding the coefficients in that linear combination.

(a) Explain why the above statement about unique linear combinations is true.

(b) Pick a set from Activity 3, part (b) for which is_basis returns true. Find the coefficients for the linear combination of the vectors in this set that is equal to 〈0, 1, 2, 0〉.

8. Consider the following func, which assumes that name_vector_space has been run and accepts a vector space V.

Make_Basis := func(V);
    local select;
    select := func(SETV, W);
        $ is_basis takes the set and the space, as specified in Activity 4
        if is_basis(SETV, V) then
            return SETV;
        else
            SETV := SETV with arb(W);
            W := W - All_LC(SETV);
            return select(SETV, W);
        end;
    end;
    return select({}, V less ov);
end;

(a) Explain in words what Make_Basis does.

(b) Pick a vector space V and apply Make_Basis three times to V. How are your three results related to each other? Are they the same? Do they have the same number of elements?

(c) Apply is_basis to each of the three sets of vectors returned by the three applications of Make_Basis. Describe what happens.

Discussion

Summation Notation

A very important tool for writing expressions in linear algebra is summation notation, that is, expressions such as

\[
\sum_{i=1}^{n} t_i\mathbf{v}_i
\]

where $t_1, t_2, \ldots, t_n$ is a sequence of scalars and $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ is a sequence of vectors. If you wrote this summation expression out without the $\sum$ symbol, what would you get?

The operator %.va in ISETL works exactly like the $\sum$ symbol in mathematics. Following are some summation notations that mean the same thing in mathematics as do the ISETL expressions in Activity 1. See if you can match them, expression for expression. In particular, what has replaced SK1, SK2, SV1, SV2?

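\[
\sum_{i=1}^{n} t_i\mathbf{v}_i
\]

\[
\sum_{i=1}^{3} a_i\mathbf{u}_i
\]

\[
\sum_{i=1}^{3} b_i\mathbf{w}_i
\]

\[
\sum_{i=1}^{3} a_i\mathbf{u}_i + \sum_{i=1}^{3} b_i\mathbf{w}_i
\]

\[
\sum_{i=1}^{6} c_i\mathbf{v}_i
\]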

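One thing you can do to figure out such expressions is to write them out in full detail without any summation or % symbols. Thus we have

\[
\sum_{i=1}^{4} t_i\mathbf{v}_i = t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + t_3\mathbf{v}_3 + t_4\mathbf{v}_4
\]

and

\[
\sum_{i=1}^{n} t_i\mathbf{v}_i = t_1\mathbf{v}_1 + t_2\mathbf{v}_2 + \cdots + t_n\mathbf{v}_n.
\]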
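The same expansion can be seen in ISETL with ordinary numbers standing in for vectors. This is a minimal sketch with made-up data: it uses the built-in reduction %+ on a tuple of products, which plays the role for numbers that %.va plays for vectors once name_vector_space has been run.

```isetl
$ Illustrative data only: three scalars, and three numbers standing in for vectors.
t := [2, 3, 4];
x := [1, 10, 100];

$ The reduction %+ places + between the components of the tuple,
$ so the next two expressions have the same value.
%+[t(i) * x(i) : i in [1..#t]];
t(1)*x(1) + t(2)*x(2) + t(3)*x(3);
```

Writing the tuple former out by hand, as in the last line, is exactly the "write everything out" step applied to a sum with three terms.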

You can factor a scalar out of an expression. Thus if all of the $t_i$ in $\sum_{i=1}^{n} t_i\mathbf{v}_i$ were equal to $t$, what would $\sum_{i=1}^{n} t\mathbf{v}_i$ be equal to? If you are in doubt, choose a value for $n$ and write everything out.

You can also add two summation expressions termwise, as in

\[
\sum_{i=1}^{n} (a_i + b_i) = \sum_{i=1}^{n} a_i + \sum_{i=1}^{n} b_i.
\]

Again, if you need clarification, choose a value for $n$ and write everything out.

Things can get very complicated if you have multi-indices, or sequences of sequences, as with matrices. Thus if $(a_{ij})$, where $i = 1, 2, \ldots, m$ and $j = 1, 2, \ldots, n$, is a matrix or doubly indexed sequence of scalars, we have

\[
\sum_{i,j} a_{ij} = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}.
\]

Once more, choose values for $m$ and $n$ and write everything out to help understand what these expressions mean. You will have an opportunity in the exercises to practice with these symbols.
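For instance, taking $m = 2$ and $n = 3$ (values chosen here only for illustration), the double sum can be written out in either order:

```latex
\[
\sum_{i=1}^{2}\sum_{j=1}^{3} a_{ij}
  = (a_{11} + a_{12} + a_{13}) + (a_{21} + a_{22} + a_{23}),
\]
\[
\sum_{j=1}^{3}\sum_{i=1}^{2} a_{ij}
  = (a_{11} + a_{21}) + (a_{12} + a_{22}) + (a_{13} + a_{23}).
\]
```

Both expressions contain the same six terms, grouped by rows in the first case and by columns in the second, so they are equal.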


Bases

Here is a mathematical formulation of the concept expressed in ISETL by the func is_basis that you wrote in Activity 4.

Definition 4.4.1. A non-empty set $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ in a vector space $V$ is called a basis for $V$ if every vector $\mathbf{v} \in V$ can be written in one and only one way as a linear combination of the vectors in $B$.

In Activity 3, you considered, altogether, seven sets of vectors in two vector spaces. Which of these are bases for the vector spaces containing them?

You should have no difficulty showing that this definition is exactly the same as saying that the set is linearly independent and generates all of $V$.

Certain sets of vectors having a particular form will always be bases. For example, for any vector space $V = (K)^n$, the following sets of vectors are bases:

\[
\begin{aligned}
B_1 &= \{\langle 1,1,1,\ldots,1\rangle,\ \langle 0,1,1,\ldots,1\rangle,\ \langle 0,0,1,\ldots,1\rangle,\ \langle 0,0,0,1,\ldots,1\rangle,\ \ldots,\ \langle 0,0,0,\ldots,0,1\rangle\} \\
B_2 &= \{\langle 1,0,0,\ldots,0\rangle,\ \langle 0,1,0,\ldots,0\rangle,\ \langle 0,0,1,\ldots,0\rangle,\ \ldots,\ \langle 0,0,0,\ldots,0,1\rangle\} \\
B_3 &= \{\langle 1,1,1,\ldots,1\rangle,\ \langle 1,1,\ldots,1,0\rangle,\ \langle 1,1,\ldots,1,0,0\rangle,\ \ldots,\ \langle 1,1,0,\ldots,0\rangle,\ \langle 1,0,0,\ldots,0\rangle\}
\end{aligned}
\]

We prove the first case here. The latter cases are left for the exercises.

Theorem 4.4.1. Let V = (K)n. The set B1 is a basis.

Proof. First we show that the set is linearly independent. Given the vector equation

\[
a_1\langle 1,1,1,\ldots,1\rangle + a_2\langle 0,1,1,\ldots,1\rangle + a_3\langle 0,0,1,\ldots,1\rangle + \cdots + a_n\langle 0,0,0,\ldots,0,1\rangle = \langle 0,0,0,\ldots,0\rangle,
\]

we must show that

\[
a_1 = a_2 = a_3 = \cdots = a_n = 0.
\]

We have

\[
\begin{aligned}
\langle 0,0,0,\ldots,0\rangle
&= a_1\langle 1,1,1,\ldots,1\rangle + a_2\langle 0,1,1,\ldots,1\rangle + a_3\langle 0,0,1,\ldots,1\rangle + \cdots + a_n\langle 0,0,0,\ldots,0,1\rangle \\
&= \langle a_1,a_1,a_1,\ldots,a_1\rangle + \langle 0,a_2,a_2,\ldots,a_2\rangle + \langle 0,0,a_3,\ldots,a_3\rangle + \cdots + \langle 0,0,0,\ldots,0,a_n\rangle \\
&= \langle a_1,\ (a_1+a_2),\ (a_1+a_2+a_3),\ \ldots,\ (a_1+a_2+a_3+\cdots+a_n)\rangle.
\end{aligned}
\]

Therefore,

\[
\begin{aligned}
a_1 &= 0 \\
a_1 + a_2 &= 0 \\
a_1 + a_2 + a_3 &= 0 \\
&\;\;\vdots \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} &= 0 \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} + a_n &= 0.
\end{aligned}
\]

The first equation yields $a_1 = 0$. Substituting this into the second equation forces $a_2 = 0$. Substituting these results into the third equation results in $a_3 = 0$. If we continue with subsequent steps, we get the desired result; that is,

\[
a_1 = a_2 = a_3 = \cdots = a_n = 0.
\]

Next we show that $B_1$ generates all of $V$. Let $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ be the vectors in $B_1$. We must show that given any sequence $c_1, c_2, \ldots, c_n$ of scalars in $K$, we can find a sequence of scalars $a_1, a_2, \ldots, a_n$ in $K$ such that

\[
\sum_{i=1}^{n} a_i\mathbf{v}_i = \mathbf{c},
\]

where $\mathbf{c} = \langle c_1, c_2, \ldots, c_n\rangle$. When this expression is written out with all of the coordinates, you get almost exactly the same set of equations that was obtained in showing linear independence. The only difference is that each 0 on the right hand side is replaced by the appropriate $c_i$. That is, we must solve the following system of equations for the unknowns $a_1, a_2, \ldots, a_n$:

\[
\begin{aligned}
a_1 &= c_1 \\
a_1 + a_2 &= c_2 \\
a_1 + a_2 + a_3 &= c_3 \\
&\;\;\vdots \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} &= c_{n-1} \\
a_1 + a_2 + a_3 + \cdots + a_{n-1} + a_n &= c_n.
\end{aligned}
\]

Clearly, we get a solution by taking $a_1 = c_1$, $a_2 = c_2 - c_1$, $a_3 = c_3 - c_2$, and so on; in general, $a_k = c_k - c_{k-1}$ for $k \ge 2$.
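A small numerical case may help. Take $n = 3$, $K = \mathbb{R}$, and the target vector $\mathbf{c} = \langle 2, 5, 4\rangle$ (a vector chosen here purely for illustration). Solving the system from the top down gives $a_1 = 2$, then $a_2 = 5 - 2 = 3$, then $a_3 = 4 - 5 = -1$:

```latex
\[
2\,\langle 1,1,1\rangle + 3\,\langle 0,1,1\rangle + (-1)\,\langle 0,0,1\rangle
  = \langle 2,\ 2+3,\ 2+3-1\rangle
  = \langle 2, 5, 4\rangle .
\]
```

Each coefficient after the first is the difference between two consecutive coordinates of the target vector.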

The set of vectors $B_2$ is particularly important. We call it the coordinate basis and write its elements as $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$. The vector $\mathbf{e}_i$ has all of its coordinates equal to 0 except for the $i$th coordinate, which is 1.

You will notice that we have defined a basis to be a set of vectors as opposed to a sequence of vectors. The reason for this is that the property of being a basis does not depend on the order in which the vectors are considered, at least not in the context with which this course is concerned. In many situations where bases are used, however, it becomes important to fix the order of the elements. Is this the case for Activity 7? Do the coefficients you find form a set or a sequence? This issue will definitely come up in the next few paragraphs.

When we want to make use of the order of the elements of a basis, we make the set into a sequence and call it an ordered basis. Thus, given any basis, each ordering of the set produces a different ordered basis. If a basis has 10 vectors, how many ordered bases can you get from it?

Nobody is perfect, and the difference between a basis and an ordered basis can be so small that often we will forget to add the adjective “ordered” when we should. But you can always tell from the context, so whenever we are working with a basis as a sequence, we mean an ordered basis whether we say so or not.


Expansion of a Vector with respect to a Basis.

In Activity 7, you considered the following problem: given a vector space $V$, a basis $B$ for $V$, and a vector $\mathbf{v} \in V$, how can we find the coefficients of $\mathbf{v}$ in its expansion as a linear combination of the vectors in $B$? There are several ways of doing this. You might set up a system of linear equations and solve them. You could do this by hand, or use a computer tool. You could also do it (if the vector space is not too large) by using the ISETL operation choose. That is, you would apply choose to the set of all linear combinations with the condition that it be equal to the given vector. In Chapter 6, you will find another method that uses matrices.
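The choose approach might look roughly like the following. This is a hedged sketch on a deliberately tiny example in $(\mathbb{Z}_3)^2$, not a solution to Activity 7: the names lc3, B, v and coeffs are hypothetical, vectors are plain tuples, and name_vector_space is not assumed.

```isetl
$ Hypothetical, self-contained sketch over Z_3; all names are illustrative.
p := 3;

$ linear combination of the tuple of vectors SV with scalars SK, reduced mod p
lc3 := func(SK, SV);
    return [ (%+[SK(i) * SV(i)(j) : i in [1..#SK]]) mod p : j in [1..#(SV(1))] ];
end;

B := [[1,1], [0,1]];      $ an ordered basis of (Z_3)^2
v := [2,0];               $ the vector to expand

$ search all coefficient sequences for one whose combination equals v
coeffs := choose t in {[a,b] : a in {0..p-1}, b in {0..p-1}} | lc3(t, B) = v;
coeffs;
```

By hand, 2⟨1, 1⟩ + 1⟨0, 1⟩ = ⟨2, 3⟩ = ⟨2, 0⟩ in $(\mathbb{Z}_3)^2$, so the coefficient sequence being searched for is [2, 1]. The search space has $p^n$ candidates, which is why this method is only practical when the vector space is small.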

When determining these coefficients, you really have to make sure that each coefficient goes with a specific vector. One way of doing this would be to use an ordered basis. Then you find a sequence of scalars to form the coefficients and the order takes care of the matching automatically.

There is one case in which finding the coefficients is so easy that it might seem trivial. Suppose $V = K^n$ and you are working with the coordinate basis. Now, any vector $\mathbf{v} \in V$ has both its components as an element of $K^n$ and its coefficients in its expansion by the $n$ basis elements. What can you say about these two sequences of scalars? Although this fact may seem trivial, it is not. We very briefly pursue it in a more general form.

Representation of a vector space as $K^n$. Suppose you have an arbitrary vector space $V$ over a field $K$ and an ordered basis $B = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$. Then any vector $\mathbf{v} \in V$ has a sequence of scalars $t = (t_i) = (t_1, t_2, \ldots, t_n)$ which are its coefficients with respect to $B$. That is,

\[
\mathbf{v} = \sum_{i} t_i\mathbf{b}_i = t_1\mathbf{b}_1 + t_2\mathbf{b}_2 + \cdots + t_n\mathbf{b}_n.
\]

How does this compare with what you obtained in Activity 7?

Because of properties of bases, this representation is unique, so you can think of $\mathbf{v}$ as an element $t = (t_i)$ of $K^n$. Conversely, if you have an element $t = (t_i) \in K^n$, then the same equation can be used to specify a vector $\mathbf{v} \in V$. It is easy to see that the operations of vector addition and scalar multiplication are preserved by this correspondence. What exactly is meant by “preserved” here?

These comments can be summarized by saying that the vector space $V$ is “the same” as $K^n$. This being the same, however, depends on the basis $B$.

This observation is very important in more advanced studies of linear algebra. We will not pursue it in this text.

In this representation of a vector space as $K^n$, would you say that the order of the basis makes a difference?

Finding a Basis

You can always find a basis. In the case in which everything is finite, Activity 8 produces a basis. How do you know that the func Make_Basis will always work? The algorithm begins by selecting an arbitrary non-zero vector in $V$, forms the subspace generated, picks a non-zero vector not in that subspace, adds it to what has already been chosen, and continues that process until a basis is achieved.
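To see the algorithm at work, here is one possible run traced by hand in $V = (\mathbb{Z}_2)^2$, which has only four vectors. The vectors selected at each step are an assumption made for illustration, since the choice is arbitrary and another run may pick differently. The column $W$ lists the non-zero vectors not yet in the subspace generated by the chosen set.

```latex
\[
\begin{array}{c|l|l}
\text{step} & \text{chosen set} & W \\ \hline
0 & \{\,\} & \{\langle 1,0\rangle,\ \langle 0,1\rangle,\ \langle 1,1\rangle\} \\
1 & \{\langle 1,1\rangle\} & \{\langle 1,0\rangle,\ \langle 0,1\rangle\} \\
2 & \{\langle 1,1\rangle,\ \langle 0,1\rangle\} & \{\,\}
\end{array}
\]
```

After step 1 the subspace generated is $\{\langle 0,0\rangle, \langle 1,1\rangle\}$, so only two candidates remain. After step 2 the chosen set generates all four vectors of $V$ and is linearly independent, so the process stops with a basis of two elements.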

Did you find in Activity 8 that you got a different basis each of the three times you ran Make_Basis? How about the number of elements in each basis? Why does this happen? Could you guarantee that the basis you get contains some given set of one or more vectors? What would you have to assume about this set?

Finite dimensional vector spaces. Suppose that the field $K$ is not finite. For example, your vector space might be $\mathbb{R}^n$. Or, as in Activity 12 of Section 4.1, it might be the set of all functions from $[-\pi, \pi]$ to $\mathbb{R}$ whose derivatives of every order exist at every point in the interval (left and right derivatives at the endpoints). In such a case you can still apply the algorithm, but you can’t be sure of what will happen. That is, you pick a non-zero vector and put it in the set $B$. Then you pick a non-zero vector not in the subspace generated by $B$ and add that vector to $B$. You continue this process. If your vector space is finite, it must stop. If the vector space is not finite, it may or may not stop. If it stops after finitely many steps, then the resulting set $B$ is a basis and a finite set. This case is important enough to warrant a formal definition.

Definition 4.4.2. If a vector space $V$ has a basis which is a finite set, then $V$ is called a finite dimensional vector space.

As we indicated above, even if this process does not stop, you can still show that the vector space has a basis, but we will not discuss that situation here.


Characterizations of bases. We defined a basis for a vector space in a way that is equivalent to being a set which is both linearly independent and generates the whole space. (See the comment after Definition 4.4.1 and Exercise 2.) There are two other characterizations of a basis.

Theorem 4.4.2. A subset $B$ of a vector space $V$ is a basis if and only if it is a maximal linearly independent set. That is, $B$ is linearly independent and if any other vector is added to it, then it is no longer linearly independent.

Proof. Exercise.

Theorem 4.4.3. A subset $B$ of a vector space $V$ is a basis if and only if it is a minimal generating set. That is, the subspace generated by $B$ is all of $V$, but if any vector is removed from $B$, then the subspace it generates is no longer all of $V$.

Proof. Exercise.

Dimension

In several activities for this section, you found bases for various vector spaces. In Activity 4, you found bases for $(\mathbb{Z}_2)^5$ and for $(\mathbb{Z}_3)^4$. In Activity 5 you constructed two different bases for $P_4(\mathbb{Z}_3)$, and in Activity 8, you found three bases for the same vector space.

In all of these examples, did you notice any regularities in the number of elements in a basis? The bases for $(\mathbb{Z}_2)^5$ and for $(\mathbb{Z}_3)^4$ had, respectively, 5 and 4 elements. Note that for each of these vector spaces, the coordinate bases also have 5 and 4 elements, respectively. In the other examples, what did all bases for the same vector space always have in common?

Do you think there is a general result here? There is, but first we must consider an important fact about the maximum number of elements in a linearly independent set. In going through the following proof, it might help you to pick values for $m$ and $n$ and write out all of the summations.

Theorem 4.4.4. If $V$ has a basis with $n$ elements in it, then any subset of $V$ with more than $n$ elements must be linearly dependent.

Proof. Suppose that $B = \{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n\}$ is a basis for $V$, and that $C = \{\mathbf{c}_1, \mathbf{c}_2, \ldots, \mathbf{c}_m\}$ is a subset of $V$ with $m > n$. We must show that $C$ is a linearly dependent set. To do that we must find scalars $a_1, a_2, \ldots, a_m$, not all zero, such that

\[
\sum_{i=1}^{m} a_i\mathbf{c}_i = \mathbf{0}.
\]

Now, because $B$ is a basis, each element of $C$ is equal to some linear combination of the vectors in $B$. That is, we have scalars $t_{ij}$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, such that for each $i = 1, \ldots, m$ we have

\[
\mathbf{c}_i = \sum_{j=1}^{n} t_{ij}\mathbf{b}_j.
\]

Substituting these expressions in the equation we have to solve, this equation becomes

\[
\sum_{i=1}^{m} a_i \sum_{j=1}^{n} t_{ij}\mathbf{b}_j = \mathbf{0}
\]

or

\[
\sum_{i=1}^{m} \sum_{j=1}^{n} a_i t_{ij}\mathbf{b}_j = \mathbf{0}
\]

and, reversing the order of the two sums (why can we do this?), it becomes

\[
\sum_{j=1}^{n} \sum_{i=1}^{m} a_i t_{ij}\mathbf{b}_j = \mathbf{0}.
\]

In this vector equation, we can replace $\mathbf{0}$ by its expression as a linear combination of the basis vectors to obtain

\[
\sum_{j=1}^{n} \Bigl(\sum_{i=1}^{m} a_i t_{ij}\Bigr)\mathbf{b}_j = \sum_{j=1}^{n} 0\,\mathbf{b}_j.
\]

This equation expresses the equality of two linear combinations of the basis vectors and therefore, because of the uniqueness, each coefficient of $\mathbf{b}_j$, $j = 1, \ldots, n$, is the same on both sides of the equation. This leads to the following system of equations:

\[
\begin{aligned}
\sum_{i=1}^{m} a_i t_{i1} &= 0 \\
\sum_{i=1}^{m} a_i t_{i2} &= 0 \\
&\;\;\vdots \\
\sum_{i=1}^{m} a_i t_{in} &= 0
\end{aligned}
\]

But this is a system of $n$ equations in $m$ unknowns, with $m > n$, so the system must have a solution in which not all of the unknowns are 0. Why? This notion will be pursued in Exercise 13.
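To make the argument concrete, here is the smallest interesting case written out, with $n = 2$ and $m = 3$. The particular set $C$ is chosen only for illustration: $\mathbf{c}_1 = \mathbf{b}_1$, $\mathbf{c}_2 = \mathbf{b}_2$, and $\mathbf{c}_3 = \mathbf{b}_1 + \mathbf{b}_2$, so that $t_{11} = 1$, $t_{12} = 0$, $t_{21} = 0$, $t_{22} = 1$, $t_{31} = 1$, $t_{32} = 1$.

```latex
\[
\begin{aligned}
\sum_{i=1}^{3} a_i t_{i1} &= a_1 + a_3 = 0, \\
\sum_{i=1}^{3} a_i t_{i2} &= a_2 + a_3 = 0.
\end{aligned}
\]
\[
a_1 = 1,\quad a_2 = 1,\quad a_3 = -1
\quad\Longrightarrow\quad
\mathbf{c}_1 + \mathbf{c}_2 - \mathbf{c}_3
  = \mathbf{b}_1 + \mathbf{b}_2 - (\mathbf{b}_1 + \mathbf{b}_2) = \mathbf{0}.
\]
```

There are two equations and three unknowns, and the solution shown has no zero entries, so $C$ is linearly dependent, as the theorem asserts.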

With this theorem, we can easily prove what you observed in considering the number of elements in a basis.

Theorem 4.4.5. Any two bases for a vector space $V$ have the same number of elements.

Proof. Suppose we have two sets which are bases for $V$. Applying Theorem 4.4.4 to the fact that the first set is a basis and the second set is linearly independent, we conclude that the second set cannot have more elements than the first. Reversing the two sets, we conclude that the first set cannot have more elements than the second. Hence they have the same number of elements.

This theorem allows us to make the following definition:

Definition 4.4.3. The dimension of a finite dimensional vector space is the number of elements in a basis.

Why do we need Theorem 4.4.5 before we can define the dimension of a vector space?

Some of the following theorems were illustrated by examples in the activities. You will have a chance to give general proofs in the exercises.


Theorem 4.4.6. The dimension of the vector space Kn is n.

Proof. Exercise.

Theorem 4.4.7. If $V$ is an $n$-dimensional vector space, then any set of vectors which generates $V$ must have at least $n$ elements.

Proof. Exercise.

Theorem 4.4.8. If $V$ is an $n$-dimensional vector space, then any set of $n$ linearly independent vectors generates $V$.

Proof. Exercise.

Theorem 4.4.9. If $V$ is an $n$-dimensional vector space, then any set of $n$ vectors which generates $V$ must be linearly independent.

Proof. Exercise.

Theorem 4.4.10. If $V$ is an $n$-dimensional vector space, and $B$ is a set of linearly independent vectors in $V$, then there is a basis which contains $B$; that is, the set $B$ can be extended into a basis of $V$.

Proof. Exercise.

Dimensions of Euclidean spaces. The Euclidean space $R^2$ contains subspaces of three types: the origin, a single point, which is a subspace of dimension zero; lines through the origin, which are subspaces of dimension one; and the entire space, which is of dimension two. The Euclidean space $R^3$ contains four types of subspaces: the origin; lines through the origin; planes through the origin; and the entire space, which is itself a subspace of dimension 3.

We have seen that any vector space of the form $V = K^n$, whether K is equal to the set of real numbers R or not, is a space of dimension n possessing subspaces of smaller dimension. Such a result is consistent with the geometric notion of dimension discussed in Euclidean space.



Most of the vector spaces you have studied so far are of the form $(Z_p)^n$. This is because these are concrete examples and are the easiest to work with. But you have also begun to work with some other examples: $P_n(K)$, the vector space of polynomials of degree less than or equal to a certain number n with coefficients in some field K; the vector space of all solutions of a system of homogeneous linear equations; and the vector space of all functions which satisfy a certain differential equation. In each of these cases, the space has bases, and they are important. We will consider some first facts.

The results you found in Activity 5(a) are completely general and you might have been able to solve this problem more easily by hand without the computer. After all, what is a linear combination of monomials but a polynomial whose degree is less than or equal to the highest degree of a monomial, which is n. Thus you get all polynomials with degree less than or equal to n. Can such a polynomial be equal to the zero polynomial, if the coefficients are not all zero? That one could be interesting, so you will have a chance to play with it in the exercises.

Theorem 4.4.11. The monomials $1, x, x^2, \ldots, x^n$ form a basis for $P_n(K)$.

Proof. Exercise.

Determining other bases for $P_n(K)$ involves a lot of work with the properties of polynomials, and we will not go much farther with that in this text.

Closely related to the vector space of polynomials is the vector space of polynomial functions. The result of Theorem 4.4.11 does not hold in general for these spaces; instead we have the following.

Theorem 4.4.12. The monomial functions $x \longrightarrow 1, \ldots, x \longrightarrow x^n$ form a basis of $PF_n(R)$.

If $n < p$, then the monomial functions $x \longrightarrow 1, \ldots, x \longrightarrow x^n$ form a basis of $PF_n(Z_p)$.

If $n \geq p$, then the monomial functions $x \longrightarrow 1, \ldots, x \longrightarrow x^{p-1}$ form a basis of $PF_n(Z_p)$.


Consider the system of equations in Activity 8, and try to solve it completely, perhaps using the methods you learned in Chapter 3.

You should be able to determine that all solutions $[x_1, x_2, x_3]$ are given by:

\[ x_1 = s, \qquad x_2 = t, \qquad x_3 = s + t \]

where s, t run independently through all values in $Z_5$.

Another way to say this is that the solutions form a subspace of $(Z_5)^3$ and that this subspace is generated by the two vectors 〈1, 0, 1〉, 〈0, 1, 1〉. Can you see why this is so? If it is, then, since these two vectors are obviously linearly independent, they form a basis for the space of solutions of this system.

This situation is very general, and you will study it more in Chapter 6. For now, we can introduce some words you will meet later. This system has three equations in three unknowns, and the vector space of solutions is of dimension 2. Hence we say that the rank of the system is 1, and its nullity is 2.

Here is something else for you to mull over. Suppose you throw away all of the x's and the = 0 parts of the system of equations in Activity 8, leaving you with a 3 × 3 matrix. Treat the rows of this matrix as vectors and notice that the three vectors do not form a linearly independent set. Moreover, the largest subset which is linearly independent has only 1 vector. Is this 1 a coincidence? Now do the same thing with the columns of the matrix. Are the results the same? What’s that all about?

There is a deep mathematical connection between linear algebra and the solutions of a linear differential equation that involves a lot of important mathematics. In this book, we can only give the barest hint of the tip of an iceberg.

Recall, from Section 4.3, the differential equation

f ′′ + f = 0

where f is an unknown function in $C^\infty(R)$.

You checked in Section 4.3 that the two functions sin, cos are solutions to this differential equation. You should have no trouble showing that they form a linearly independent set in the vector space $C^\infty(R)$. Do they generate the subspace of all solutions?

In the theory of differential equations, the study of initial value problems shows that if you choose any real numbers a, b then there is a unique function $f \in C^\infty(R)$ which is a solution to the differential equation and satisfies f(0) = a and f′(0) = b. On the other hand, since the values of the sin function and its derivative at 0 are 0, 1 respectively and the values of the cos function and its derivative at 0 are 1, 0 respectively, given any a, b, we can find scalars s, t such that the function s sin + t cos has its value at 0 and its derivative at 0 equal to a, b respectively.
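The scalars s and t can be found explicitly. As a short worked computation (the name h is introduced here only for illustration), write h = s sin + t cos and evaluate h and its derivative at 0:

```latex
\[
\begin{aligned}
h(0)  &= s\sin(0) + t\cos(0) = t, \\
h'(0) &= s\cos(0) - t\sin(0) = s.
\end{aligned}
\]
```

So the conditions h(0) = a and h′(0) = b are met by taking t = a and s = b.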

So take any function $f \in C^\infty(R)$ that is a solution to the differential equation, and let a = f(0), b = f′(0). Then find s, t as above. We have the fact that both f and s sin + t cos are solutions to the differential equation and they have the same values and derivatives at 0. By the uniqueness, it follows that

f = s sin + t cos

Hence {sin, cos} generates the space of solutions of the differential equation, so it is a basis for that space. Incidentally, can you explain why we use sin, cos here and not sin(x), cos(x)?

You might think that some of the statements just made require proofs. You will have an opportunity to provide them in the exercises.

Exercises

1. Write out each of the following expressions or equations without use of summation notation. Then explain why the equations in (b)–(e) are true.

(a) $\sum_{i=1}^{4} a_i b_i$

(b) $\sum_{i=1}^{4} t b_i = t \sum_{i=1}^{4} b_i$

(c) $\sum_{i=1}^{4} (a_i + b_i) = \sum_{i=1}^{4} a_i + \sum_{i=1}^{4} b_i$

(d) $\sum_{i,j} a_{ij} = \sum_{i=1}^{3} \sum_{j=1}^{4} a_{ij}$

(e) $\sum_{i=1}^{3} \sum_{j=1}^{4} a_{ij} = \sum_{j=1}^{4} \sum_{i=1}^{3} a_{ij}$

2. Show that a subset of a vector space V is a basis if and only if it is linearly independent and generates all of V.

3. Show that each of the following sets is a basis for $K^n$.

(a) B2 = {〈1, 0, 0, . . . , 0〉 , 〈0, 1, 0, . . . , 0〉 , 〈0, 0, 1, . . . , 0〉 , . . . ,〈0, 0, 0, . . . , 0, 1〉}


(b) B3 = {〈1, 1, 1, . . . , 1〉 , 〈1, 1, 1, . . . , 0〉 , 〈1, 1, 1, . . . , 0, 0〉 , . . . ,〈1, 1, 0, . . . , 0〉 , 〈1, 0, 0, . . . , 0〉}

4. In the paragraph on Representations of a vector space as $K^n$ it is stated that “It is easy to see that the operations of vector addition and multiplication are preserved by this correspondence.”

(a) Explain what is meant by “preserved”.

(b) Prove that vector addition is preserved.

(c) Prove that scalar multiplication is preserved.

5. In Activity 3(a) choose the set which is a basis and find the coordinates of the expansion of the vector 〈1, 0, 1, 0, 1〉 with respect to this basis.

6. In Activity 3(b) choose the set which is a basis and find the coordinates of the expansion of the vector 〈0, 1, 2, 0〉 with respect to this basis.

7. Let $P_4(R)$ be the vector space of all polynomials of degree less than or equal to 4 with real coefficients.

(a) Find a basis which contains the polynomial x+ 1

(b) Find a basis which contains the polynomials x+ 2, x− 1

(c) Find a basis which contains the polynomials x− 2, x2 − 2

8. For each of the bases you found in the previous exercise, find the expansion, with respect to that basis, of the polynomial x.

9. Let φ be the set of all finite sequences of real numbers.

(a) Define a scalar multiplication and a vector addition on this set and show that with these operations it becomes a vector space.

(b) Explain what could be meant by the “coordinate basis” for this vector space and show that it is a basis.

(c) Find a basis B for φ in which no sequence contains a zero.

(d) Consider the element of φ which is the sequence consisting of three 0’s followed by three 1’s. Find the expansion of this vector with respect to your basis B.



11. Prove Theorem 4.4.3.

12. Write out the proof of Theorem 4.4.4 for the case n = 3, m = 5 using no summation symbols.

13. (a) Consider the following homogeneous system of 3 equations in 4 unknowns:

\[
\begin{aligned}
x_1 - x_2 + 2x_3 + 4x_4 &= 0 \\
2x_1 + 3x_2 - x_3 + x_4 &= 0 \\
-4x_1 + 5x_2 + 3x_3 - 2x_4 &= 0.
\end{aligned}
\]

Show there exists a non-trivial solution to the system.

(b) Generalize the results of the previous part to show that a system of n equations in m unknowns, with m > n, must have a solution in which not all of the unknowns are 0. This completes the proof of Theorem 4.4.4.






19. (a) Choose several values for n and fields K and in each case show that any polynomial in $P_n(K)$, whose coefficients are not all zero, cannot be the zero polynomial.

(b) Show in general for any n, K that any polynomial in $P_n(K)$, whose coefficients are not all zero, cannot be the zero polynomial.



22. Show the set {〈1, 0, 1〉, 〈0, 1, 1〉} is a basis for the solution space of the system of equations in Activity 8.

23. Show that the set {sin, cos} is linearly independent in the vector space $C^\infty(R)$.

24. The set {sin, cos} also spans the subspace of $C^\infty(R)$ consisting of all solutions of f′′ + f = 0 (and so, with the previous exercise, it is a basis for that subspace). To show that it generates the whole solution space requires some background in differential equations. Look this up and sketch a proof that {sin, cos} spans the space of solutions.

Chapter 5

Linear Transformations

Remember the definition of a function from previous mathematics courses? In calculus, functions are the main object of study, as differentiation and integration both operate on functions. You are probably saying to yourself, “Of course, a function is a mapping from some set called the domain into another set called the co-domain in which any element from the domain is mapped to exactly one element in the co-domain.” Or something like “a subset f of the Cartesian product of A and B such that for every a ∈ A there is exactly one b ∈ B such that (a, b) ∈ f.” This chapter is going to explore some functions from one vector space to another and consider how portions of the domains and ranges might be thought of as vector spaces themselves.


5.1 Introduction to Linear Transformations

Activities

1. Let $U = (Z_3)^2$, the vector space of ordered pairs of elements in $Z_3$. Let u, v ∈ U, where $u = \langle u_1, u_2 \rangle$ and $v = \langle v_1, v_2 \rangle$. Define a function T : U −→ U by

\[ T(u) = \langle u_1 \cdot u_2,\; 2 \cdot u_2 \rangle . \]

For all u, v ∈ U and for all $c, d \in Z_3$, perform the following steps:

(a) Compute cu + dv, and then find T (cu + dv).

(b) Compute T (u) and T (v), and then find cT (u) + dT (v).

(c) Determine whether T (cu + dv) = cT (u) + dT (v).

2. Let $U = (Z_3)^2$ and $V = (Z_3)^3$. Let u, v ∈ U, where $u = \langle u_1, u_2 \rangle$ and $v = \langle v_1, v_2 \rangle$. Define a function F : U −→ V by

\[ F(u) = \langle u_1 + u_2,\; 2 \cdot u_1 + u_2,\; u_1 \rangle . \]

For all u, v ∈ U and for all $c, d \in Z_3$, perform the following steps:

(a) Compute cu + dv, and then find F (cu + dv).

(b) Compute F (u) and F (v), and then find cF (u) + dF (v).

(c) Determine whether F (cu + dv) = cF (u) + dF (v).

3. Let U and V be vector spaces with scalars in K, and assume that name_vector_space has been run. Write an ISETL func is_linear that accepts a func H : U −→ V, where U and V indicate vector spaces over K; checks the equality

H(cu + dv) = cH(u) + dH(v)

for all pairs of scalars c, d ∈ K and all pairs of vectors u, v ∈ U; and returns true, if the equality being checked holds for all possible scalar and vector pairs, or false, if the equality does not hold. Apply this func to the funcs T and F defined in Activities 1 and 2.


4. Let $U = Z_3$. Let u, v ∈ U. Define a function H : U −→ U by

\[ H(u) = u^2. \]

(a) Apply the func is_linear to H. Does is_linear return true or false in this case?

(b) If we define G : U −→ U by

\[ G(u) = u^n, \]

where n is an integer greater than or equal to 1, for what values of n will is_linear return true?

5. Let $U = (Z_5)^2$. Run name_vector_space, and complete parts (a)–(f).

(a) Write a func T that accepts a vector $\langle u_1, u_2 \rangle \in (Z_5)^2$ and returns a vector in $(Z_5)^2$. The first component of the output is the sum of the product of $u_1$ by 2 with the product of $u_2$ by 4, and the second component is the sum of $u_1$ with the product of 3 and $u_2$.

(b) Write a func F that accepts a vector $\langle u_1, u_2 \rangle \in (Z_5)^2$ and returns a vector given by $\langle u_1 + 2, u_2 \rangle \in (Z_5)^2$.

(c) Write a func H that accepts a vector $\langle u_1, u_2 \rangle \in (Z_5)^2$ and returns a vector given by $\langle 0, u_2 \rangle \in (Z_5)^2$.

(d) Write a func R that accepts a vector $\langle u_1, u_2 \rangle \in (Z_5)^2$ and returns a vector given by $\langle 3u_1 + 2u_2 + 2,\; 2u_1 + u_2 + 3 \rangle \in (Z_5)^2$.

(e) Write a func S that accepts a vector $\langle u_1, u_2 \rangle \in (Z_5)^2$ and returns a vector given by $\langle 3u_1 + 2u_2,\; 2u_1 + u_2 \rangle \in (Z_5)^2$.

(f) Apply the func is_linear to each of the funcs you have constructed in (a)–(e). Which return true? Which return false?

6. Let $U = R^2$ be the coordinate plane, the vector space of ordered pairs with real-valued components. Let u = 〈2, 5〉 and v = 〈−1, 3〉 be two vectors in the plane. Define $G : R^2 \longrightarrow R^2$ to be the function that accepts a vector $u \in R^2$ and returns the vector found by rotating u counterclockwise through $\pi/6$ radians. If we think of u geometrically as an arrow that emanates from the origin, then a rotation through θ radians refers to rotating the given arrow θ radians in a counterclockwise direction.


(a) Use the ISETL tool vectors to graph G(u) + G(v) and G(u + v). What do you observe about the relationship between G(u) + G(v) and G(u + v)?

(b) Let c = 2. Use vectors to graph G(cu) and cG(u). What do you observe about the relationship between G(cu) and cG(u)?

(c) For $u, v \in R^2$ and $c, d \in R$, does G satisfy the equality

G(cu + dv) = cG(u) + dG(v)?

Explain your answer.

7. Let $U = R^2$ be the coordinate plane. Let u = 〈3, 1〉 and v = 〈−1, 3〉 be two vectors in the plane. Define $H : R^2 \longrightarrow R^2$ to be the function that accepts a vector $u \in R^2$ and returns the vector found by reflecting u through the line whose equation is given by $y = \sqrt{3}\,x$. If we think of u geometrically as an arrow emanating from the origin, then a reflection of u refers to finding the mirror image of its arrow with respect to the given reflecting line.

(a) Use vectors to graph H(u) + H(v) and H(u + v). What do you observe about the relationship between H(u) + H(v) and H(u + v)?

(b) Let c = 2. Use vectors to graph H(cu) and cH(u). What do you observe about the relationship between H(cu) and cH(u)?

(c) For $u, v \in R^2$ and $c, d \in R$, does H satisfy the equality

H(cu + dv) = cH(u) + dH(v)?


8. Let $U = R^2$ be the coordinate plane. Let u = 〈3, 2〉 and v = 〈1, 5〉 be two vectors in the plane. Define $S : R^2 \longrightarrow R^2$ to be the function that accepts a vector $u \in R^2$ and that returns the vector found by translating the vector u by the vector 〈3, 4〉. If we think of u geometrically as an arrow that emanates from the origin, then a translation of u refers to “moving” u to a new location in the plane without disturbing its original direction.

(a) Use vectors to graph S(u) + S(v) and S(u + v). What do you observe about the relationship between S(u) + S(v) and S(u + v)?

(b) Let c = 2. Use vectors to graph S(cu) and cS(u). What do you observe about the relationship between S(cu) and cS(u)?

(c) For $u, v \in R^2$ and $c, d \in R$, does S satisfy the equality

S(cu + dv) = cS(u) + dS(v)?


9. Let $D : C^\infty(R) \longrightarrow C^\infty(R)$ be the operator defined by D(f) = f′, the derivative of f. Let $D^2$ be the operator defined by D ◦ D. Determine whether $D^2 : C^\infty(R) \longrightarrow C^\infty(R)$ satisfies the condition:

\[ D^2(af + bg) = aD^2(f) + bD^2(g), \qquad f, g \in C^\infty(R),\; a, b \in R. \]

10. Let $\int_0^1 : PF_3(R) \longrightarrow R$ be defined by

\[ \int_0^1 p = \int_0^1 p(t)\, dt. \]

Determine whether $\int_0^1$ satisfies the condition

\[ \int_0^1 (af + bg) = a \int_0^1 f + b \int_0^1 g, \qquad f, g \in PF_3(R),\; a, b \in R. \]

11. Let $J : PF_2(R) \longrightarrow PF_3(R)$ be defined by

\[ J(p) = x \longrightarrow \int_2^x p(t)\, dt. \]

Determine whether J satisfies the condition

\[ J(af + bg) = aJ(f) + bJ(g), \qquad f, g \in PF_2(R),\; a, b \in R. \]


Discussion

Functions between Vector Spaces

In calculus, you worked with functions whose domains and ranges were both some set of the real numbers: each input was a real number and each output was a real number. In multivariable calculus, you broadened your horizons a bit. A function of two variables accepts an ordered pair (x, y) of numbers and returns a real number. A function of three variables accepts an ordered triple (x, y, z) and returns a real number. With vector-valued functions, that is, functions whose ranges are vector spaces, the situation is reversed: each input is a real number, and each output can be an ordered pair or an ordered triple. For example, a function $h : R \longrightarrow R^3$ defined by $h(x) = \langle x^2, 2x + 3, 3x^3 \rangle$ accepts a real number x and returns the vector, or ordered triple, $\langle x^2, 2x + 3, 3x^3 \rangle$.

In linear algebra, you have the opportunity to expand your view even further. Look again at the functions T and F defined in the first two activities. What are the inputs and outputs of these functions? You might have noticed that both accept a vector u in $(Z_3)^2$, while T returns a vector in $(Z_3)^2$, and F returns a vector in $(Z_3)^3$. A function between vector spaces assigns one and only one output vector, say v in V, to each input vector, say u in U. Such functions are often called “vector transformations” or just “transformations”. Check all the functions you worked with in the activities: Which of these are vector transformations? In Activity 1, you were asked whether T(cu + dv) = cT(u) + dT(v). Part (c) of Activity 2 asked you a similar question regarding the function F. In Activity 3, you were asked to write a func that would check this condition for any vector space function H : U −→ V. In particular, given any H : U −→ V, any pair of vectors u, v ∈ U, and any pair of scalars c, d, does the following equality hold:

H(cu + dv) = cH(u) + dH(v)?

Any vector space function that satisfies this condition is called a linear transformation. Is the function T defined in Activity 1 a linear transformation? Is the function F defined in Activity 2 a linear transformation?
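For readers who want to cross-check their ISETL work in another language, the single condition above can be tested exhaustively when the field is finite. The following is a minimal Python sketch, not the ISETL func requested in Activity 3: the signature `is_linear(H, p, n)` and the Python versions of T and F are assumptions made for illustration, with vectors in $(Z_p)^n$ represented as tuples.

```python
from itertools import product

def is_linear(H, p, n):
    """Check H(c*u + d*v) == c*H(u) + d*H(v) for all u, v in (Z_p)^n and c, d in Z_p."""
    vectors = list(product(range(p), repeat=n))
    for u, v in product(vectors, repeat=2):
        Hu, Hv = H(u), H(v)
        for c, d in product(range(p), repeat=2):
            combo = tuple((c * a + d * b) % p for a, b in zip(u, v))
            lhs = tuple(x % p for x in H(combo))
            rhs = tuple((c * a + d * b) % p for a, b in zip(Hu, Hv))
            if lhs != rhs:
                return False  # one failing pair is enough
    return True

# Hypothetical Python versions of the functions from Activities 1 and 2 (p = 3).
def T(u):
    return ((u[0] * u[1]) % 3, (2 * u[1]) % 3)

def F(u):
    return ((u[0] + u[1]) % 3, (2 * u[0] + u[1]) % 3, u[0] % 3)
```

Calling `is_linear(T, 3, 2)` and `is_linear(F, 3, 2)` gives results you can compare with what your own func reported. The exhaustive loop is only feasible because $(Z_3)^2$ has nine vectors; it is not a proof technique for infinite fields such as R.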


Definition and Significance of Linear Transformations

The single condition you used in Activity 3 to construct the func is_linear can be separated into two conditions, as presented in the definition given below.

Definition 5.1.1. Let U and V be vector spaces with scalars in K. A function T : U −→ V is a linear transformation if

(i.) T (u + v) = T (u) + T (v) for u,v ∈ U and

(ii.) T (cu) = cT (u) for u ∈ U and c ∈ K.

How could the func you defined in Activity 3 be modified to check the two conditions given in the definition? Of those funcs in Activities 1, 2, 4, and 5 that are not linear, which fail condition (i)? condition (ii)? both?

In Activity 4, we defined a familiar function using vector notation. You can verify that the set of real numbers R is in fact a vector space over itself, that is, the set of scalars is also R. The function H defined in this activity, like any function from R to R, is a vector transformation: a real number, say x ∈ R, thought of in this context as a vector, is assigned to the vector, or real number, $x^2$. When you applied is_linear, what did you find? Is H a linear transformation? For what n is the function G defined in Activity 4(b) a linear transformation?

Many of the functions you studied in calculus are not linear. However, linear transformations are extremely important in calculus. For example, when you compute the derivative D of a differentiable function g : R −→ R at a point x = a, the function G : R −→ R given by

\[ G(x) = D(g)(a) \cdot (x - a), \]

where (·) denotes real number multiplication, is linear in the displacement x − a and approximates the change g(x) − g(a) near x = a. This is also true of multivariable functions involving the gradient. For a function of two variables $h : R^2 \longrightarrow R$, the function $H : R^2 \longrightarrow R$ given by

\[ H(x, y) = \langle D_x(h)(a, b),\; D_y(h)(a, b) \rangle \cdot \langle (x - a),\; (y - b) \rangle , \]

where $D_x$ denotes the partial derivative function with respect to x, $D_y$ represents the partial derivative function with respect to y, and (·) represents the dot product, is linear in the displacement 〈x − a, y − b〉 and approximates the change h(x, y) − h(a, b) near the point (a, b). Can you show that H is linear in this sense?

In Activity 9, you were asked to determine whether the differential operator $D^2$ is linear. What did you find? Is your answer consistent with what has just been discussed regarding the functions G and H?

In Activities 10 and 11, you considered the definite integral applied to polynomials from $PF_2(R)$ and $PF_3(R)$. Do the definite integrals defined in these activities satisfy the conditions given in Definition 5.1.1? If not, what modifications would need to be made to ensure linearity?

Why is linearity so important? The two conditions specified in Definition 5.1.1 ensure “preservation” of vector addition and scalar multiplication for a function between two vector spaces. Suppose that T : U −→ V is a linear transformation. The first condition given in Definition 5.1.1,

T(u + v) = T(u) + T(v), u, v ∈ U,

guarantees that the vector assigned to u + v under T is equal to the sum of the vectors assigned by T to u and v individually. This is illustrated in the diagram below: the sum of the outputs, T(u) and T(v), is equal to the output of the sum, T(u + v).

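\[
\begin{array}{ccc}
(u, v) & \xrightarrow{\;T\;} & (T(u), T(v)) \\
\downarrow {\scriptstyle +} & & \downarrow {\scriptstyle +} \\
u + v & \xrightarrow{\;T\;} & T(u) + T(v)
\end{array}
\]

Thus, T and + can be applied in any order: taking the sum of u and v followed by applying T yields the same answer as taking the sum of the images of u and v separately.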

The second condition given in Definition 5.1.1,

T (cu) = cT (u), u ∈ U, c ∈ K,

guarantees preservation of scalar multiplication: the vector assigned by T to the scalar product cu is equal to the product of c with the vector assigned by T to u.

\[
\begin{array}{ccc}
u & \xrightarrow{\;T\;} & T(u) \\
\downarrow {\scriptstyle c\,\cdot} & & \downarrow {\scriptstyle c\,\cdot} \\
cu & \xrightarrow{\;T\;} & cT(u)
\end{array}
\]

Similar to the case involving the sum, T and · can be applied in any order: taking the product cu followed by applying T yields the same answer as multiplying the image of u by the scalar c.

In Activities 6, 7, and 8, you were asked whether familiar geometric transformations such as rotations, reflections, and translations preserve the operations of vector addition and scalar multiplication in $R^2$. Activity 6 described a rotation through $\pi/6$ radians. The general definition and its analytic, or algebraic, representation are given in the following definition.

Definition 5.1.2. A rotation is a function that takes a vector in $R^2$ and rotates it in a counterclockwise fashion through an angle of φ radians. It is given by the formula

\[ T(\langle x, y \rangle) = \langle x\cos\phi - y\sin\phi,\; x\sin\phi + y\cos\phi \rangle . \]

[Figure 5.1: A rotation. Axes x and y; the vector (x, y) at angle θ from the x-axis and its image (x′, y′) after rotation through the angle φ.]

As the figure indicates, let $\langle x, y \rangle \in R^2$ be a vector in the plane, and assume that the vector 〈x, y〉 forms an angle of θ radians with the x-axis. Let 〈x′, y′〉 be the vector returned after 〈x, y〉 has been rotated through φ radians.

If we drop a perpendicular segment from the point (x, y) to the x-axis, we can see that the segment from the origin to the point (x, 0) has length x, the segment from (x, 0) to (x, y) has length y, and the vector 〈x, y〉 has length $\sqrt{x^2 + y^2}$. We can show that

\[ x = \sqrt{x^2 + y^2}\,\cos\theta, \qquad y = \sqrt{x^2 + y^2}\,\sin\theta. \]

Similarly, if we drop a perpendicular segment from (x′, y′) to the x-axis, we can see that

\[ x' = \sqrt{x^2 + y^2}\,\cos(\theta + \phi), \qquad y' = \sqrt{x^2 + y^2}\,\sin(\theta + \phi). \]

If we then apply standard trigonometric identities, we can see that the geometric representation is the same as the analytic representation, which is given by the expression for T.
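The trigonometric step can be written out as a short worked computation using the angle-addition identities. The abbreviation $r = \sqrt{x^2 + y^2}$ is introduced here only to shorten the formulas.

```latex
\[
\begin{aligned}
x' &= r\cos(\theta + \phi)
    = (r\cos\theta)\cos\phi - (r\sin\theta)\sin\phi
    = x\cos\phi - y\sin\phi, \\
y' &= r\sin(\theta + \phi)
    = (r\sin\theta)\cos\phi + (r\cos\theta)\sin\phi
    = x\sin\phi + y\cos\phi.
\end{aligned}
\]
```

These are exactly the two components of T(〈x, y〉) in Definition 5.1.2.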

In Activity 7, you were asked to consider a reflection through the line $y = \sqrt{3}\,x$. As with a rotation, there is a general analytic, or algebraic, representation.

Definition 5.1.3. A reflection through the line y = mx is a function that takes a vector in $R^2$ and returns a vector given by the formula

\[ T(\langle x, y \rangle) = \langle x\cos 2\phi + y\sin 2\phi,\; x\sin 2\phi - y\cos 2\phi \rangle , \]

where the angle of inclination of the reflecting line is given by $\phi = \tan^{-1} m$.

Let $\langle x, y \rangle \in R^2$ be a vector in the plane, and assume that the vector 〈x, y〉 forms an angle of θ radians with the x-axis, as shown in Figure 5.2. Assume that the line y = mx forms an angle of φ radians with the x-axis, where m = tan φ. Let 〈x′, y′〉 represent the vector returned after 〈x, y〉 has been reflected through y = mx.

The drawing indicates that the reflection through y = mx is the same as first reflecting the vector 〈x, y〉 through the x-axis and then rotating the resulting reflection 〈x, −y〉 through 2φ radians. The components of 〈x′, y′〉 can be found by applying the expression for a rotation given in Definition 5.1.2. Can you explain why a reflection can be represented as a reflection through an axis followed by a rotation?
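The agreement between the reflection formula and the "reflect through the x-axis, then rotate through 2φ" description can be checked numerically. The following is a minimal Python sketch; the function names `rotate`, `reflect`, and `reflect_then_rotate` are hypothetical, and a numerical check at sample points is an illustration, not an explanation of why the two descriptions agree.

```python
import math

def rotate(vec, phi):
    """Rotation through phi radians, as in Definition 5.1.2."""
    x, y = vec
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi))

def reflect(vec, m):
    """Reflection through y = m*x, using the formula in Definition 5.1.3."""
    phi = math.atan(m)
    x, y = vec
    return (x * math.cos(2 * phi) + y * math.sin(2 * phi),
            x * math.sin(2 * phi) - y * math.cos(2 * phi))

def reflect_then_rotate(vec, m):
    """Reflect through the x-axis, then rotate through 2*phi radians."""
    phi = math.atan(m)
    x, y = vec
    return rotate((x, -y), 2 * phi)

# The line and the vector u from Activity 7.
m = math.sqrt(3)
u = (3.0, 1.0)
direct = reflect(u, m)
composed = reflect_then_rotate(u, m)
```

Comparing `direct` and `composed` with `math.isclose` on each component is one way to confirm the two descriptions agree for this vector. A vector lying on the line, such as (1, m), should be returned unchanged by `reflect`.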

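[Figure 5.2: Reflection through a line. Axes x and y; the line y = mx at angle φ; the vector (x, y) at angle θ, its reflection (x, −y) through the x-axis, and the image (x′, y′) obtained after rotating through 2φ.]

The transformation described in Activity 8 is an example of a translation, which is defined below.

[Figure 5.3: A translation. Axes x and y; the vector (x, y), the translating vector (c, d) with components c and d, and the image (x′, y′).]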


Definition 5.1.4. A translation by the vector 〈c, d〉 is a function $T : R^2 \longrightarrow R^2$ that takes a vector in $R^2$ and returns a vector given by the formula

T(〈x, y〉) = 〈x + c, y + d〉.

Let $\langle x, y \rangle \in R^2$ be a vector in the plane, and assume that 〈x′, y′〉 represents the vector returned after 〈x, y〉 has been translated by the vector 〈c, d〉. As the figure illustrates, 〈x′, y′〉 is the vector sum of 〈x, y〉 and 〈c, d〉.

Based upon your work in the Activities, which of these geometric transformations is a linear transformation? For each one that is not, can you identify which operation, vector addition or scalar multiplication, is not “preserved”?

In addition to preservation of vector addition and scalar multiplication, linear transformations preserve lines. A line through the vector a in the direction of v is given by the set

{tv + a : t ∈ R}.

[Figure 5.4: Vector form of a line. The direction vector v, a scalar multiple tv, the vector a, and the point tv + a on the line.]

As shown in Figure 5.4 in $R^2$, the direction vector v emanates from the origin, a is the vector through which the line passes, and every point on the line can be represented as the vector sum of a and some scalar multiple of v.

Linear transformations transform lines into lines. For example, the transformation $T : R^3 \longrightarrow R^3$ given by

\[ T(\langle u_1, u_2, u_3 \rangle) = \langle u_1 + u_2,\; u_2 - u_3,\; u_1 + u_3 \rangle \]

transforms the line {t〈2, 1, −1〉 + 〈−1, 3, 1〉 : t ∈ R} into the line {t〈3, 2, 1〉 + 〈2, 2, 0〉 : t ∈ R} as shown in Figure 5.5.

[Figure 5.5: Linear Transformation of a line. The vectors a and v, the multiple tv, and the point tv + a on the line.]
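The line example can be confirmed with a few lines of arithmetic. The following is a minimal Python sketch; the helper name `point_on_line` and the sampled range of t are assumptions chosen for illustration.

```python
def T(u):
    """The transformation T(<u1, u2, u3>) = <u1 + u2, u2 - u3, u1 + u3>."""
    u1, u2, u3 = u
    return (u1 + u2, u2 - u3, u1 + u3)

def point_on_line(v, a, t):
    """Return the point t*v + a on the line through a in the direction of v."""
    return tuple(t * vi + ai for vi, ai in zip(v, a))

v, a = (2, 1, -1), (-1, 3, 1)
Tv, Ta = T(v), T(a)  # direction and base point of the image line

# Each sampled point of the original line lands on the line t*T(v) + T(a).
images_match = all(
    T(point_on_line(v, a, t)) == point_on_line(Tv, Ta, t)
    for t in range(-3, 4)
)
```

Here `Tv` and `Ta` are the direction vector and base point of the image line stated in the text. Sampling integer values of t only spot-checks the claim; the general argument is the proof of the theorem that follows.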

The example being considered here can be generalized.

Theorem 5.1.1. Let $T : R^n \longrightarrow R^n$ be a linear transformation, and let l be a line in $R^n$ given by

l = {tv + a : t ∈ R}.

Then, the image of l under T is also a line.

Proof. We must show that T(l) is a line. Since T is assumed to be a linear transformation, for every t ∈ R we can write

\[
\begin{aligned}
T(tv + a) &= T(tv) + T(a) \\
          &= tT(v) + T(a).
\end{aligned}
\]

Since T is a vector transformation from $R^n$ to $R^n$, $T(v) \in R^n$ and $T(a) \in R^n$. Therefore, T(l) = {tT(v) + T(a) : t ∈ R} is a line; that is, T transforms the line passing through the vector a in the direction of v into the line passing through the vector T(a) in the direction of T(v).

In the exercises, you will be asked to show related consequences of linearity. Specifically, linear transformations transform parallel lines into parallel lines, line segments into line segments, and squares into parallelograms.


Component Functions and Linear Transformations

Many of the vector spaces you have studied in this course involve spaces of tuples, that is, a vector space $K^n$, where K is the set of real numbers R or some finite field such as $Z_5$. Later in this text, we will show that non-tuple, finite dimensional vector spaces are structurally equivalent to spaces of tuples of the same dimension. Hence, any insight regarding linear transformations between spaces of tuples is of particular importance to us.

As you may have noticed in the activities, any function $H : K^n \longrightarrow K^m$ can be decomposed into a set of component functions. For example, the function $T : (Z_3)^2 \longrightarrow (Z_3)^2$, given in Activity 1 and defined by T(〈a, b〉) = 〈(a · b), (2 · b)〉, can be expressed in the form

\[ T(\langle a, b \rangle) = \langle t_1(\langle a, b \rangle),\; t_2(\langle a, b \rangle) \rangle , \]

where the function $t_1 : (Z_3)^2 \longrightarrow Z_3$, defined by

\[ t_1(\langle a, b \rangle) = (a \cdot b), \]

corresponds to the expression in the first component, and the function $t_2 : (Z_3)^2 \longrightarrow Z_3$, defined by

\[ t_2(\langle a, b \rangle) = (2 \cdot b), \]

corresponds to the expression given in the second component. Can you provide similar descriptions for the func F defined in Activity 2 and the funcs described in Activity 5?

Consider the transformation defined in Activity 2. In this example, you may have noticed that the expression for each component function is a linear combination of the components of the input vector. Does this characteristic appear to hold true for those funcs in Activity 5 that you deemed to be linear? Is this true for the func defined in Activity 1, a transformation which you discovered was not linear? What about the non-linear funcs defined in Activity 5: is each component a linear combination of the components of the input vector, or is there at least one component for which this fails?

As the next theorem illustrates, the patterns you discovered in the activities can be generalized.

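Theorem 5.1.2. A function $T : K^n \longrightarrow K^m$ given by

\[ T(u) = \langle f_1(u), f_2(u), \ldots, f_m(u) \rangle \]

is a linear transformation if and only if each component function

\[ f_i : K^n \longrightarrow K, \qquad i = 1, 2, \ldots, m, \]

is given by

\[
\begin{aligned}
f_1(u) &= f_1(\langle u_1, u_2, \ldots, u_n \rangle) = a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
f_2(u) &= f_2(\langle u_1, u_2, \ldots, u_n \rangle) = a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
f_m(u) &= f_m(\langle u_1, u_2, \ldots, u_n \rangle) = a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n,
\end{aligned}
\]

where each $a_{ij}$, $i = 1, 2, \ldots, m$, $j = 1, 2, \ldots, n$, is a scalar.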

Proof. (⇐=:) Let $u = \langle u_1, u_2, \ldots, u_n \rangle \in K^n$. Then,

\[ T(u) = T(\langle u_1, u_2, \ldots, u_n \rangle) = \langle f_1(\langle u_1, u_2, \ldots, u_n \rangle),\; f_2(\langle u_1, u_2, \ldots, u_n \rangle),\; \ldots,\; f_m(\langle u_1, u_2, \ldots, u_n \rangle) \rangle , \]

where we assume that each component function is

\[
\begin{aligned}
f_1(u) &= f_1(\langle u_1, u_2, \ldots, u_n \rangle) = a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
f_2(u) &= f_2(\langle u_1, u_2, \ldots, u_n \rangle) = a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
f_m(u) &= f_m(\langle u_1, u_2, \ldots, u_n \rangle) = a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n,
\end{aligned}
\]

and each $a_{ij} \in K$, $i \in \{1, \ldots, m\}$, $j \in \{1, \ldots, n\}$, is a scalar. To establish that T is a linear transformation, it suffices to show that each component function $f_i : K^n \longrightarrow K$, $i = 1, 2, \ldots, m$, is linear. The details are left as an exercise. See Exercise 18.

(=⇒:) Assume that T is a linear transformation. Let $u = \langle u_1, u_2, \ldots, u_n \rangle \in K^n$, and rewrite u as the sum

\[ u = \langle u_1, u_2, \ldots, u_n \rangle = \langle u_1, 0, \ldots, 0 \rangle + \langle 0, u_2, 0, \ldots, 0 \rangle + \cdots + \langle 0, \ldots, 0, u_n \rangle . \]

Since we are assuming that T is a linear transformation,

\[ T(u) = T(\langle u_1, 0, \ldots, 0 \rangle) + T(\langle 0, u_2, 0, \ldots, 0 \rangle) + \cdots + T(\langle 0, \ldots, 0, u_n \rangle). \]

Since each vector $\langle 0, \ldots, 0, u_j, 0, \ldots, 0 \rangle$ has at most one nonzero component, each component function $f_i : K^n \longrightarrow K$, $i = 1, \ldots, m$, behaves like a single-variable function that accepts $u_j$ and returns a scalar; specifically,

\[ f_i(\langle 0, \ldots, 0, u_j, 0, \ldots, 0 \rangle) = a_{ij} u_j, \]

where $a_{ij} \in K$. If we take the sum over all j, we achieve the desired result for each component function $f_i$. The details are left to the exercises. See Exercise 19.

In the second part of the proof, (=⇒), there is an assumption being made. Can you identify what that assumption is? Can you state and prove a theorem that would address this assumption?
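One way to get a feel for the scalars $a_{ij}$ in Theorem 5.1.2 is to compute them for a specific transformation. The following is a minimal Python sketch, under the assumption that T is already known to be linear; the function names `coefficient_rows` and `apply_rows` are hypothetical, and the sample T is the transformation used in the line example earlier in this section.

```python
def coefficient_rows(T, n):
    """Build the rows [a_i1, ..., a_in] by applying T to the vectors <0,...,1,...,0>."""
    images = []
    for j in range(n):
        e_j = tuple(1 if k == j else 0 for k in range(n))
        images.append(T(e_j))  # the image of e_j supplies a_1j, ..., a_mj
    m = len(images[0])
    return [[images[j][i] for j in range(n)] for i in range(m)]

def apply_rows(rows, u):
    """Evaluate each component function f_i(u) = a_i1*u_1 + ... + a_in*u_n."""
    return tuple(sum(a * x for a, x in zip(row, u)) for row in rows)

def T(u):
    u1, u2, u3 = u
    return (u1 + u2, u2 - u3, u1 + u3)

rows = coefficient_rows(T, 3)
```

For a linear T, `apply_rows(rows, u)` should agree with `T(u)` for every input u. If the two disagree for some u, then T cannot be linear, which makes this a quick experiment to try on the funcs from Activity 5 as well.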


Throughout this section, we have considered transformations between vectorspaces of tuples. In other chapters, you have been introduced to other, non-tuple examples. Although these examples are familiar, you have been askedto think about them in a new context. For instance, consider the set C∞(R)of all infinitely differentiable functions from R to R. This is a vector spacewith scalars in R. Do you remember what each vector in this space lookslike? Do you recall how the addition and scalar multiplication operations aredefined?

In Activity 9, you were asked to determine whether the second derivative operator is linear. What did you observe? If we define a function F : C∞(R) −→ C∞(R) by

F (f) = D^2(f) + f,

where f ∈ C∞(R), can we say that F is a linear transformation on C∞(R)? If so, can you prove that F is linear? If not, can you explain which condition of linearity is being violated?

Another class of non-tuple vector spaces consists of sets of polynomial functions organized by degree. For instance,

\[ PF_3(\mathbb{R}) = \{\, x \longrightarrow a_0 + a_1x + a_2x^2 + a_3x^3 : a_0, a_1, a_2, a_3 \in \mathbb{R} \,\}, \]

the set of all polynomial functions of degree three or less with real-valued coefficients, is a vector space. What did you find in Activities 10 and 11? Is the definite integral defined in this activity linear? What about the function J?

Several of the exercises will ask you to make similar determinations. In particular, when given a function between two non-tuple vector spaces, how does one determine whether the given function satisfies the two conditions of linearity specified in Definition 5.1.1?
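Before attempting a proof, it can help to test the two conditions on sample vectors. The sketch below is in Python rather than ISETL. It stores a polynomial a0 + a1x + a2x^2 + a3x^3 as the list [a0, a1, a2, a3], and the two maps it tests, evaluate_at_2 and square_constant, are hypothetical examples rather than functions from the activities. A failed check shows that a map is not linear; a passed check is only evidence, never a proof.

```python
# A polynomial a0 + a1*x + a2*x^2 + a3*x^3 is stored as [a0, a1, a2, a3].

def add(p, q):
    """Vector addition in the polynomial space: add coefficientwise."""
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Scalar multiplication: multiply every coefficient by c."""
    return [c * a for a in p]

# Two hypothetical maps from the polynomial space to the scalars.
def evaluate_at_2(p):
    return sum(a * 2 ** k for k, a in enumerate(p))

def square_constant(p):
    return p[0] ** 2

def passes_spot_check(F, p, q, c):
    """Test F(p + q) = F(p) + F(q) and F(c p) = c F(p) on one sample."""
    additive = F(add(p, q)) == F(p) + F(q)
    homogeneous = F(scale(c, p)) == c * F(p)
    return additive and homogeneous

p = [1, -2, 0, 4]
q = [3, 5, -1, 2]
c = 3

print(passes_spot_check(evaluate_at_2, p, q, c))
print(passes_spot_check(square_constant, p, q, c))
```

The same pattern applies to any candidate map: write the vector operations of the domain once, then compare both sides of each linearity condition.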

Exercises

1. Explain why each of the following functions from R2 to R2 is not a linear transformation.

(a) T1(〈x, y〉) = 〈3y + 1, x + y〉

(b) T2(〈x, y〉) = 〈xy, 3x + 4y〉

(c) T3(〈x, y〉) =⟨y

4, 2y + x+ 4

⟩

(d) T4(〈x, y〉) = 〈−3x + 2y − 4, 5x − y + 7〉

2. Define T : R2 −→ R by T (〈x, y〉) = xy. Determine whether T is a linear transformation. If the function is linear, use the definition to prove it; if the function is not linear, explain why the definition fails.

3. Define T : R3 −→ R5 by

T (〈u1, u2, u3〉) =

〈(3u1 + 2u2 − u3), (u1 − 2u2 − u3), 0, (u2 + 5u3), (−3u1 + 3u2)〉 .

Use the definition to show that T is a linear transformation.

4. Define T : Rn −→ R by

T (u) = a · u,

where a = 〈a1, a2, . . . , an〉 ∈ Rn, u is any vector in Rn, and (·) represents the dot product on Rn. Use the definition to show that T is a linear transformation. What is the significance of T as it relates to the topic of the dot product in multivariable calculus?

5. Determine whether each function given below is linear.


(a) D : P3(Z5) −→ P2(Z5) defined by

\[ D(a_3x^3 + a_2x^2 + a_1x + a_0) = 3a_3x^2 + 2a_2x + a_1. \]

(b) T : P3(R) −→ R defined by

\[ T(a_3x^3 + a_2x^2 + a_1x + a_0) = a_3a_2 + a_0. \]

(c) G : P3(R) −→ R defined by

\[ G(a_3x^3 + a_2x^2 + a_1x + a_0) = 2a_3 - a_2 + a_1. \]

(d) H : P3(Z5) −→ P3(Z5) defined by

\[ H(a_3x^3 + a_2x^2 + a_1x + a_0) = a_3x^3 + a_2x^2 + a_1x + a_0 + 3. \]

(e) S : P3(R) −→ P3(R) defined by

\[ S(a_3x^3 + a_2x^2 + a_1x + a_0) = a_3(x+2)^3 + a_2(x+2)^2 + a_1(x+2) + a_0. \]

6. Determine whether each function given below is linear.

(a) D : PF3(R) −→ PF2(R) defined by

D(p) = p′,

where p′ is the derivative of p.

(b) T : PF3(R) −→ R defined by

\[ T(p) = \int_0^1 p(t)\,dt \]

(c) J : PF3(R) −→ PF4(R) defined by

\[ J(p) = x \longrightarrow \int_2^x p(t)\,dt \]

7. Let V be the vector space of all functions from R to R whose domain is all of R. Let T : C∞(R) −→ V be defined by

T (f) = F,

where F is a function such that D(F ) = f and F (0) = k ≠ 0. Use the definition to show that T is not a linear transformation. In order for T to be a linear transformation, we would have to limit ourselves to a certain subspace of C∞(R). What subspace would that be?


8. Let g : R −→ R be a differentiable function. Let a ∈ R. Let G : R −→ R be defined by

G(x) = D(g)(a) · (x − a),

where D denotes the derivative function. Show that G is a linear transformation.

9. Let C∞(R) be the vector space of all functions that are infinitely differentiable. Let D : C∞(R) −→ C∞(R) be the derivative function, that is, D(g) = g′ for g ∈ C∞(R). Define P : C∞(R) −→ R by

\[ P(g) = D(g)\left(\tfrac{1}{2}\right), \]

where g ∈ C∞(R), so that P (g) is the derivative of g evaluated at the point 1/2. Use the definition to show that P is a linear transformation.

10. Define f : R3 −→ R by

\[ f(x_1, x_2, x_3) = x_1 + x_2^2 + x_1^2 x_3. \]

Define F : R3 −→ R by

F (x1, x2, x3) = 〈Dx1(f)(2, 1, 1), Dx2(f)(2, 1, 1), Dx3(f)(2, 1, 1)〉 · 〈x1 − 2, x2 − 1, x3 − 1〉 ,

where Dxi , i = 1, 2, 3, denotes the partial derivative function with respect to the variable xi, and (·) denotes the dot product. Show that F is a linear transformation.

11. Let f : Rn −→ R be a function whose partial derivatives exist. Let (a1, a2, . . . , an) ∈ Rn. Define F : Rn −→ R by

F (x1, x2, . . . , xn) = 〈Dx1(f)(a1, . . . , an), Dx2(f)(a1, . . . , an), . . . , Dxn(f)(a1, . . . , an)〉 · 〈x1 − a1, x2 − a2, . . . , xn − an〉 ,

where Dxi , i = 1, 2, . . . , n, denotes the partial derivative function with respect to the variable xi, and (·) denotes the dot product. Show that F is a linear transformation.


12. Use the definition of a linear transformation to prove that a rotation is a linear transformation.

13. Use the definition of a linear transformation to prove that a reflection through a line passing through the origin is a linear transformation.

14. Use the definition of a linear transformation to prove that a translation is not a linear transformation.

15. A line segment passing through the vector a in the direction v is given by the set {tv + a : t ∈ [0, 1]}.

(a) Define T : R2 −→ R2 by

T (〈u1, u2〉) = 〈2u1 + 3u2,−u1 + u2〉 .

Show symbolically and geometrically that the line segment given by

{t 〈1, 2〉+ 〈−3, 4〉 : t ∈ [0, 1]}

is transformed into a line segment. Find the direction vector v and the vector a through which the segment passes.

(b) Let T : Kn −→ Kn be a linear transformation. Show that the image of a line segment given by {tv + a : t ∈ [0, 1]} is also a line segment.

16. Consider the line l in R3 given by l = {t 〈2,−1, 3〉+ 〈1, 0,−2〉 : t ∈ R}.

(a) Find the form of the line l′ that passes through the vector 〈0,−1, 1〉 and is parallel to l.

(b) Define T : R3 −→ R3 by

T (〈u1, u2, u3〉) = 〈2u1 + 3u2,−u1 + u2, u2 + u3〉 .

Show symbolically and geometrically that T (l) and T (l′) are parallel.

(c) Let T : Kn −→ Kn be a linear transformation. Let l be a line passing through al in the direction v. Let l′ be a line parallel to l that passes through the vector al′ . Show that T (l) and T (l′) are parallel.


17. Let T : Kn −→ Kn be a linear transformation. Let l be a line in Kn that passes through the origin. Show that T (l) passes through the origin.

18. Complete the proof of the first part (⇐=) of Theorem 5.1.2.

19. Complete the proof of the second part (=⇒) of Theorem 5.1.2. What assumption is being made in the second part of the proof, (=⇒)? State this assumption as a theorem, and provide a proof.

20. Let T : U −→ V be a linear transformation, where U and V are vector spaces with scalars in K. Use induction to show that

T (cu1 + du2) = cT (u1) + dT (u2), u1,u2 ∈ U, c, d ∈ K,

holds for any size combination; in particular, show that

T (a1u1 + a2u2 + · · ·+ anun) = a1T (u1) + a2T (u2) + · · ·+ anT (un),

where u1, . . . ,un ∈ U , a1, . . . , an ∈ K, and n ≥ 1.


5.2 Kernel and Range

Activities

1. Use UKn, defined in Activity 5 in Section 4.1, and the func LC, which you constructed in Activity 5 of Section 4.2, to construct the set SOL of all 5-tuples that satisfy simultaneously each of the four equations in Z5 listed below.

x1 + x2 + x5 = 0

x2 + x3 + x5 = 0

x3 + x4 + x5 = 0

x1 + x4 + x5 = 0

2. Let T : (Z5)5 −→ (Z5)4 be a linear transformation defined as follows:

The first component of the output is the sum of the first, second, and last components of the input.

The second component of the output is the sum of the second, third, and last components of the input.

The third component of the output is the sum of the third, fourth, and last components of the input.

The last component of the output is the sum of the first, fourth, and last components of the input.

(a) Write an ISETL statement that constructs the set called KER of all vectors u in (Z5)5 such that T (u) = 0.

(b) Is KER a subspace of (Z5)5? Use is subspace to determine this.

3. Complete parts (a)–(c) as a means of determining the relationship between the sets KER and SOL.

(a) Determine whether SOL is a subset of KER. Use the ISETL code SOL Subset KER to make this determination.

(b) Determine whether KER is a subset of SOL. Use the ISETL code KER Subset SOL to make this determination.


(c) Based only upon your findings in (a) and (b), which of the following appears to be true:

• Every element of SOL is also an element of KER, but there is at least one element of KER that is not an element of SOL.

• Every element of KER is also an element of SOL, but there is at least one element of SOL that is not an element of KER.

• KER is not a subset of SOL, nor is SOL a subset of KER.

• The sets SOL and KER are equal.

4. Find a basis for the solution set from Activity 1. What is the dimension of KER? Explain your reasoning.

5. Construct the set IMAGESPACE of all vectors of the form T (u), where u is in (Z5)5, and T is the func you constructed in Activity 2.

6. Take T from Activity 2, and rewrite the output vector in the form of a matrix product, that is, express T in the form T (u) = A · u, where A is a 4 × 5 matrix whose entries in row i are the coefficients of component i of the output vector. Then, complete the following steps.

(a) Think of each column of A as a vector in (Z5)4. Apply the func All LC you constructed in Activity 4 of Section 4.1 to find the set COLS of all linear combinations of the columns of A.

(b) Use ISETL to determine whether IMAGESPACE is a subset of COLS, and vice-versa. What is the relationship between the subspace generated by the columns of the matrix A, given by COLS, and the set IMAGESPACE?

(c) Apply T to the coordinate basis

{〈1, 0, 0, 0, 0〉 , 〈0, 1, 0, 0, 0〉 , 〈0, 0, 1, 0, 0〉 , 〈0, 0, 0, 1, 0〉 , 〈0, 0, 0, 0, 1〉}

of (Z5)5. Compare the image of each basis vector with the columns of A. What do you observe?

7. Take the set you constructed in Activity 4, and extend it to a basis for all of (Z5)5. Apply T to each newly constructed basis vector. Apply All LC to the set of all such T (u). Call the resulting set IMB. Use ISETL to determine the relationship between IMAGESPACE and IMB. Is IMAGESPACE a subset of IMB? Is IMB a subset of IMAGESPACE?

8. What is the relationship between the dimensions of (Z5)5, KER, and IMAGESPACE? Make a conjecture about a general relation between these dimensions.

9. Consider AX = B, where A is the matrix of coefficients of the left side of the system in Activity 1, B is a 4 × 1 matrix representing the constants on the right-hand side of the equals sign, and

\[ X = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5 \end{pmatrix}. \]

(a) Determine the solution set; that is, find all vectors X such that AX = B. Select one such solution, and call it vp.

(b) Determine the solution set of AX = 0, where 0 denotes the 4 × 1 matrix whose entries are all zero. Find a particular solution, and call it v0.

(c) Does the vector vp + v0 form a solution of AX = B? Select a different solution to AX = 0, replace v0 with this new solution, and check whether vp + v0 is a solution to AX = B.

(d) Can you find a solution of AX = B that is not of the form vp + v0?

Discussion

The Kernel of a Linear Transformation

Let T : U −→ V be a linear transformation between two vector spaces U and V . In this section, we will introduce two subspaces that are associated with a linear transformation: the kernel, which is a subspace of U , and the image space, a subspace of V .

Definition 5.2.1. Let U and V be vector spaces with scalars in K, and let T : U −→ V be a linear transformation. The kernel of T , denoted ker(T ), is the set of all vectors in U that are mapped to the zero vector in V under T . In symbols,

ker(T ) = {u ∈ U : T (u) = 0V }.

Figure 5.6: The kernel of T (diagram: ker(T ) inside U is mapped by T to 0V in V )

What is the relationship between the kernel of a linear transformation and the set KER you constructed in Activity 2(a)? What did you find in part (b), when you applied the func is subspace to KER? Before going on to the next theorem, think about whether the kernel of a linear transformation is a subspace.

Theorem 5.2.1. Let U and V be vector spaces with scalars in K, and let T : U −→ V be a linear transformation. The kernel of T , ker(T ), is a subspace of U .

Proof. In order to show that ker(T ) is a subspace of U , we must show that the sum of any two vectors in ker(T ) is an element of ker(T ) and that any scalar multiple of a vector in ker(T ) is an element of ker(T ). The details are left as an exercise. See Exercise 9.

Every subspace of a vector space contains the zero vector as an element. Since ker(T ) is a subspace of the domain, the zero vector must be an element of the kernel. Consequently, the zero vector of the domain is mapped to the zero vector of the range space. Is this true for any function?

Theorem 5.2.2. Let U and V be vector spaces with scalars in K, and let T : U −→ V be a linear transformation. The zero vector of U is mapped to the zero vector of V under T , that is, T (0U) = 0V , where 0U denotes the zero vector in U , and 0V represents the zero vector in V .

Proof. See Exercise 10.

In Activity 3, you were asked to compare the sets SOL and KER. If you compare the system of equations given in Activity 1 and the expression for T described in Activity 2, you will notice that the left-hand side of each equation is one of the components of the expression for T . This suggests that the kernel of a linear transformation between spaces of tuples can be found by solving a homogeneous system of equations. Generally speaking, if T : Kn −→ Km is given by

T (〈x1, x2, . . . , xn〉) = 〈a11x1 + a12x2 + · · ·+ a1nxn,

a21x1 + a22x2 + · · ·+ a2nxn, . . . ,

am1x1 + am2x2 + · · ·+ amnxn〉,

then the kernel, defined to be the set

ker(T ) = {〈x1, x2, . . . , xn〉 ∈ Kn : T (〈x1, x2, . . . , xn〉) = 〈0, 0, . . . , 0〉},

is the solution set of the system

a11x1 + a12x2 + · · ·+ a1nxn = 0

a21x1 + a22x2 + · · ·+ a2nxn = 0

...

am1x1 + am2x2 + · · ·+ amnxn = 0.

Conversely, the solution set in Kn of the homogeneous system

b11x1 + b12x2 + · · ·+ b1nxn = 0

b21x1 + b22x2 + · · ·+ b2nxn = 0

...

bm1x1 + bm2x2 + · · ·+ bmnxn = 0


is the kernel of the linear transformation T : Kn −→ Km defined by

T (〈x1, x2, . . . , xn〉) = 〈b11x1 + b12x2 + · · ·+ b1nxn,

b21x1 + b22x2 + · · ·+ b2nxn, . . . ,

bm1x1 + bm2x2 + · · ·+ bmnxn〉 .

This explains why KER and SOL were equal. Since ker(T ) is a subspace of U , the relationship shown above suggests that the solution set of a homogeneous system is also a subspace. This is indeed the case.
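The equality of the two sets can also be reproduced outside ISETL. The following is a minimal sketch in Python, not the ISETL code from the activities: it enumerates all of (Z5)5, builds the kernel of the transformation T described in Activity 2, builds the solution set of the system in Activity 1 equation by equation, and compares the two. The names P, vectors, KER, and SOL are chosen here for illustration.

```python
from itertools import product

P = 5  # all arithmetic is done modulo 5, that is, in Z5

def T(u):
    """The transformation of Activity 2, from (Z5)^5 to (Z5)^4."""
    x1, x2, x3, x4, x5 = u
    return ((x1 + x2 + x5) % P,
            (x2 + x3 + x5) % P,
            (x3 + x4 + x5) % P,
            (x1 + x4 + x5) % P)

vectors = list(product(range(P), repeat=5))  # every vector of (Z5)^5

# Kernel: the inputs that T sends to the zero vector of (Z5)^4.
KER = {u for u in vectors if T(u) == (0, 0, 0, 0)}

# Solution set: the tuples satisfying the four equations of Activity 1.
SOL = {u for u in vectors
       if (u[0] + u[1] + u[4]) % P == 0
       and (u[1] + u[2] + u[4]) % P == 0
       and (u[2] + u[3] + u[4]) % P == 0
       and (u[0] + u[3] + u[4]) % P == 0}

print(KER == SOL)
```

Brute-force enumeration works here only because (Z5)5 is finite; over R the same comparison has to be made by solving the system.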

Theorem 5.2.3. The solution set of a homogeneous system of m equations in n unknowns is a subspace of the vector space V = Kn.

Proof. Let

a11x1 + a12x2 + · · ·+ a1nxn = 0

a21x1 + a22x2 + · · ·+ a2nxn = 0

...

am1x1 + am2x2 + · · ·+ amnxn = 0

be a homogeneous system of m equations in n unknowns over K. Suppose that u = 〈u1, u2, . . . , un〉 and v = 〈v1, v2, . . . , vn〉 are solutions. Then, for every i = 1, 2, . . . ,m,

ai1(u1 + v1) + ai2(u2 + v2) + · · ·+ ain(un + vn)

= ai1u1 + ai1v1 + ai2u2 + ai2v2 + · · ·+ ainun + ainvn

= [ai1u1 + ai2u2 + · · ·+ ainun] + [ai1v1 + ai2v2 + · · ·+ ainvn] = 0,

which shows that u + v is a solution of the system. In a similar manner, we can show that if c ∈ K is a scalar and if v is a solution, then the scalar product cv is a solution. See Exercise 11.

If U and V are not vector spaces of tuples, the kernel of a linear transformation may not be representable as the solution set of a homogeneous system of equations. Even when no such correspondence exists, Definition 5.2.1 and Theorems 5.2.1 and 5.2.2 still apply.


The Image Space of a Linear Transformation

Let T : U −→ V be a linear transformation between the vector spaces U and V . We will call the set of all inputs U the domain space of T . The vector space V will be referred to as the range space of T . The set of all vectors in V that are assigned to at least one vector in U under T is the image space of T . We denote the image space of T using the notation

T (U) = {v ∈ V : there exists u ∈ U such that T (u) = v}.

The term image refers to the output of a single vector. If u ∈ U , then v = T (u) is the image of u under T . In Activity 5, you constructed the set IMAGESPACE, which is, in fact, the image space of T defined in Activity 2. Just as the kernel of T is a subspace of U , this set is a subspace of V .

Theorem 5.2.4. Let U and V be vector spaces with scalars in K. Let T : U −→ V be a linear transformation. The image space of T , T (U), is a subspace of V .

Proof. In order to show that T (U) is a subspace of V , we must show that the sum of any two vectors in T (U) is equal to a vector in T (U) and that any scalar multiple of a vector in T (U) is equal to a vector in T (U). The details are left as an exercise. See Exercise 13.

In the last subsection, we showed that there is a correspondence between the kernel of a linear transformation between tuples and the solution set of a homogeneous system of equations. There is a similar correspondence between systems and the image space. For example, if T : Kn −→ Km is defined by

T (〈x1, x2, . . . , xn〉) = 〈a11x1 + a12x2 + · · ·+ a1nxn,

a21x1 + a22x2 + · · ·+ a2nxn, . . . ,

am1x1 + am2x2 + · · ·+ amnxn〉 ,

then v = 〈v1, v2, . . . , vm〉 ∈ T (Kn) if and only if the non-homogeneous system

a11x1 + a12x2 + · · ·+ a1nxn = v1

a21x1 + a22x2 + · · ·+ a2nxn = v2

...

am1x1 + am2x2 + · · ·+ amnxn = vm


Figure 5.7: The image space of T (diagram: the set of solutions of T (u) = v in Kn is mapped by T to v in the image space of T , inside Km)

has a solution. If there were no solution, could we say that v ∈ T (Kn)?

If U and V are not vector spaces of tuples, an element of the image space may not correspond to the solution set of a particular system of equations. However, the existence of a solution to T (u) = v and the presence of v in T (U) are the same.
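The statement that v lies in T (U) exactly when T (u) = v has a solution can be tested directly when the domain is finite. The sketch below, in Python rather than ISETL, reuses the transformation T of Activity 2 and searches all of (Z5)5 for pre-images of a given v. The helper names preimages and in_image are hypothetical, introduced only for this illustration.

```python
from itertools import product

P = 5  # arithmetic modulo 5

def T(u):
    """The transformation of Activity 2, from (Z5)^5 to (Z5)^4."""
    x1, x2, x3, x4, x5 = u
    return ((x1 + x2 + x5) % P,
            (x2 + x3 + x5) % P,
            (x3 + x4 + x5) % P,
            (x1 + x4 + x5) % P)

def preimages(v):
    """All u in (Z5)^5 with T(u) = v, found by exhaustive search."""
    return [u for u in product(range(P), repeat=5) if T(u) == v]

def in_image(v):
    """v is in the image space exactly when T(u) = v has a solution."""
    return len(preimages(v)) > 0

print(in_image((1, 1, 1, 1)))
print(in_image((1, 0, 0, 0)))
```

Counting the pre-images of a vector that does lie in the image space, and comparing that count with the size of KER, is a useful preview of Theorem 5.2.7.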

Bases for the Kernel and Image Space

In Activity 4, you constructed a basis for SOL. In Activity 7, you extended this set to a basis for the entire space (Z5)5. What theorem from Chapter 4 allowed you to do this? After having found the image T (u) of each new basis vector, you applied the func All LC to the set of these images. Did this set generate the set IMAGESPACE? Is this set independent? If so, can you describe a procedure for finding a basis of the image space T (Kn) of a linear transformation T : Kn −→ Km? If not, can you explain how things break down?

Since the kernel and image space of a linear transformation are subspaces, we can talk about the dimension of each set. After having analyzed the relationship between the dimensions of (Z5)5, IMAGESPACE, and KER in Activity 8, what did you conjecture, in general, regarding the relationship between the dimensions of Kn, ker(T ), and T (Kn) for a linear transformation T : Kn −→ Km?

Before considering the theorem that addresses your conjecture, we need to introduce two terms, rank and nullity, that are used in this theorem. In Chapter 3, you learned that a system of equations, such as

a11x1 + a12x2 + · · ·+ a1nxn = c1

a21x1 + a22x2 + · · ·+ a2nxn = c2

...

am1x1 + am2x2 + · · ·+ amnxn = cm,

can be represented as a matrix equation

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{pmatrix}.
\]

The solution of such a system can be found by transforming the augmented matrix

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} & c_1 \\
a_{21} & a_{22} & \cdots & a_{2n} & c_2 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn} & c_m
\end{pmatrix}
\]

into reduced echelon form. If you recall, the rank of an augmented matrix, or any matrix for that matter, is defined to be the number of nonzero rows appearing in its reduced echelon form. In Activity 6, you showed that the image of each coordinate basis vector under T was a column of the matrix A. The columns of A generated the set IMAGESPACE. The rank of the matrix A defined in Activity 6 turns out to be equal to the dimension of IMAGESPACE, the image space of T . This link is the basis for the use of the term rank in Theorem 5.2.5. If you apply elementary row operations to A in Activity 6, can you verify that its rank is equal to the dimension of IMAGESPACE?
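Counting the nonzero rows of a reduced echelon form can be automated. The following is a minimal sketch in Python, not the ISETL tools used in Chapter 3. It assumes the scalars come from Zp with p prime, so that every nonzero entry has a multiplicative inverse, and it relies on Python 3.8 or later for pow(x, -1, p). The function name rank_mod_p and the sample matrix M are hypothetical.

```python
def rank_mod_p(rows, p):
    """Row-reduce a matrix over Z_p (p prime) and count its nonzero rows."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Look for a row, at or below position `rank`, with a nonzero
        # entry in this column; it supplies the next leading entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], -1, p)  # multiplicative inverse modulo p
        m[rank] = [(x * inv) % p for x in m[rank]]
        # Clear the rest of the column, above and below the leading 1.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [(x - factor * y) % p for x, y in zip(m[r], m[rank])]
        rank += 1
    return rank

# Hypothetical 3 x 3 matrix with entries in Z5.
M = [[1, 2, 3],
     [2, 4, 1],
     [0, 1, 1]]
print(rank_mod_p(M, 5))
```

Applying the same function to the matrix A of Activity 6 gives one way to carry out the comparison with the dimension of IMAGESPACE suggested above.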

The term nullity refers to the dimension of the null space of a linear transformation. The null space is simply the kernel; some texts elect to refer to the kernel as the null space. It would be wise for you to be familiar with both terms.

Theorem 5.2.5 (Rank and Nullity). Let U and V be finite dimensional vector spaces with scalars in K, and let T : U −→ V be a linear transformation. Then,

dim[ker(T )] + dim[T (U)] = dim(U).

Proof. Suppose dim(U) = n and dim[ker(T )] = m. Let

{u1,u2, . . . ,um}

be a basis for ker(T ). By Theorem 4.4.10, we can extend this linearly independent set to a basis

{u1,u2, . . . ,um,um+1,um+2, . . . ,un}

for U . We will show that the set

{T (um+1), T (um+2), . . . , T (un)}

is a basis for T (U).

First, we show that this set spans T (U). Let v ∈ T (U). Then, there exists u ∈ U such that T (u) = v. We can write u as a linear combination of the given basis for U . Grouping terms, we have

u = x + y,

where x is a linear combination of {u1,u2, . . . ,um} and y is a linear combination of {um+1,um+2, . . . ,un}. Since {u1,u2, . . . ,um} is a basis for ker(T ), x ∈ ker(T ). Therefore,

v = T (u) = T (x + y) = T (x) + T (y) = T (y).

Since y is a linear combination of {um+1,um+2, . . . ,un}, it follows, from the linearity of T , that T (y), and hence v, is a linear combination of

{T (um+1), T (um+2), . . . , T (un)}.

Therefore, {T (um+1), . . . , T (un)} forms a spanning set for T (U).

Next, we must show that the set {T (um+1), . . . , T (un)} is linearly independent. Suppose that

am+1T (um+1) + am+2T (um+2) + · · ·+ anT (un) = 0V.


Since T is a linear transformation, we can write

T (z) = 0V,

where z = am+1um+1 + am+2um+2 + · · ·+ anun. By the definition of kernel, z ∈ ker(T ). Since {u1,u2, . . . ,um} is a basis for ker(T ), there exists a linear combination of these vectors, say cz, such that z = cz. We can rewrite this expression as

cz − z = 0U .

Hence, we have a linear combination of {u1, . . . ,um,um+1, . . . ,un} set equal to the zero vector. Since this set is a basis, it follows that the scalars must all be zero. Therefore,

am+1 = am+2 = · · · = an = 0.

As a result, we can conclude that the set

{T (um+1), T (um+2), . . . , T (un)}

is linearly independent. Therefore, the set

{T (um+1), T (um+2), . . . , T (un)}

is a basis for T (U). According to the definition of dimension, we can conclude that dim[T (U)] = n − m, so that dim[ker(T )] + dim[T (U)] = m + (n − m) = n = dim(U), which is what we wished to prove.
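Over a finite field the theorem can be checked by counting, because a subspace of (Zp)n of dimension d has exactly p^d elements. The sketch below is in Python rather than ISETL, and the transformation it uses, from (Z3)3 to (Z3)2, is a hypothetical example rather than one from the activities. The helper dim_from_size is likewise introduced only for this illustration.

```python
from itertools import product

P = 3  # arithmetic modulo 3

# Hypothetical linear transformation T : (Z3)^3 -> (Z3)^2.
def T(u):
    x, y, z = u
    return ((x + y) % P, (y + z) % P)

domain = list(product(range(P), repeat=3))

KER = {u for u in domain if T(u) == (0, 0)}   # kernel of T
IMAGE = {T(u) for u in domain}                # image space of T

def dim_from_size(size, p):
    """Dimension d of a subspace over Z_p that has p**d elements."""
    d = 0
    while p ** d < size:
        d += 1
    assert p ** d == size, "the size of a subspace must be a power of p"
    return d

nullity = dim_from_size(len(KER), P)
rank = dim_from_size(len(IMAGE), P)
print(nullity, rank, nullity + rank == 3)
```

The counting argument depends on the field being finite; for transformations over R the dimensions have to be found from bases, as in the proof.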

Theorem 5.2.5 can be used to tie together the notions of invertibility, one-to-one, and onto. A function is one-to-one if every vector in the image space is the image of one and only one vector in the domain space. In symbols, a linear transformation T : U −→ V is one-to-one if, for u1,u2 ∈ U such that u1 ≠ u2, it follows that T (u1) ≠ T (u2). This can be stated equivalently as T (u1) = T (u2) implies that u1 = u2.

A function is defined to be onto if the image space is equal to the range space. In other words, every element of the range space is an image of some element of the domain. For a linear transformation T , this means that if v ∈ V , there exists u ∈ U such that T (u) = v.

Figure 5.8: A one-to-one transformation (diagram: T maps U into V )

Figure 5.9: An onto transformation (diagram: T maps U onto V ; the image space of T is all of V )

If you recall from calculus, a function f : D −→ R, where D denotes the domain space and R represents the range space, has an inverse f−1 : f(D) −→ D, where f(D) indicates the image space of f , if f−1(f(a)) = a for all a ∈ D and f(f−1(b)) = b for all b ∈ f(D). The underlying feature which guarantees the existence of f−1 is that each element of f(D) is assigned to one and only one element of D under f−1. Can you recall why the function f : R −→ R defined by f(x) = x^2, whose graph is given below, does not have an inverse? Can you explain this using the graph? Are you able to reason this using the algebraic expression for the function?

Figure 5.10: f(x) = x^2

Can you give an example of a function from calculus whose inverse exists? Can you justify your answer using the graph? The algebraic expression?

Working with these examples should remind you that a function has an inverse if and only if the function is one-to-one. Can you prove this? For linear transformations, we can make an even stronger statement.

Theorem 5.2.6. Let T : U −→ U be a linear transformation. The followingstatements are equivalent.

1. T is one-to-one.

2. T is onto.

3. T is invertible; that is, there exists T−1 : U −→ U such that

T−1(T (u)) = u and T (T−1(u)) = u

for every u ∈ U .


Proof. Throughout the proof, assume that dim(U) = n.

(1 =⇒ 2:) We will first show that 1 implies 2. We assume that T is one-to-one and use this to show that T is onto. Since T is one-to-one, the kernel of T , ker(T ), consists only of the zero vector. Why? Therefore, dim(ker(T )) = 0. By Theorem 5.2.5, dim(ker(T )) + dim(T (U)) = n, which implies that dim(T (U)) = n. Since T (U) is a subspace of U , and since both have dimension n, we can conclude that T (U) = U ; the image space is equal to the range space. Hence, T is onto.

(2 =⇒ 3:) We prove that 2 implies 3. We assume that T is onto and use this to show that T is invertible. Since T is onto, the image space T (U) is equal to the range space U . Therefore, by Theorem 5.2.5, the kernel of T , ker(T ), consists only of the zero vector. In order to define an inverse transformation, the pre-image of each vector v, that is, the set {u ∈ U : T (u) = v}, must consist of exactly one vector. Suppose that there exist u1,u2 ∈ U such that T (u1) = v = T (u2). By linearity,

T (u1) = T (u2) =⇒ T (u1 − u2) = 0U =⇒ u1 − u2 ∈ ker(T ).

Since ker(T ) = {0U}, u1 − u2 = 0U , or u1 = u2: there is but one vector whose image is v. Hence, we can define an inverse function.

(3 =⇒ 1:) We prove that 3 implies 1. We assume that T is invertible and use this to show that T is one-to-one. Suppose that

T (u1) = T (u2)

for some u1,u2 ∈ U . Since T−1 exists, we can write

T−1(T (u1)) = T−1(T (u2)) =⇒u1 = u2,

which is what we wished to prove.

For a linear transformation with the same domain and range spaces, this theorem establishes the logical equivalence of one-to-one, onto, and invertibility: 1 if and only if 2; 1 if and only if 3; and 2 if and only if 3. Although we did not directly prove both if 1 then 2 and if 2 then 1, each of these conditional statements holds. The proof of 1 implies 2 is given in the proof of the theorem. The implication if 2 then 1 holds, because 2 implies 3 and 3 implies 1. Using a similar strategy, can you explain why 1 ⇐⇒ 3 and 2 ⇐⇒ 3 are true?

From a practical standpoint, the presence of one condition gives the others as a result. So, if T : U −→ U is a linear transformation that is one-to-one, then T is also both onto and invertible. Speaking of linearity, where was the requirement of the linearity of T used in the proof? And, moreover, if T−1 exists, is it a linear transformation?
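One place to look when thinking about the role of linearity is the link between the kernel and the one-to-one property. The sketch below, written in Python rather than ISETL, compares three hypothetical maps on (Z5)2: two linear maps S and R, and a map N that is not linear. For each map it reports whether the only vector sent to zero is the zero vector and whether the map is one-to-one.

```python
from itertools import product

P = 5  # arithmetic modulo 5
domain = list(product(range(P), repeat=2))  # every vector of (Z5)^2

# Three hypothetical maps from (Z5)^2 to (Z5)^2.
def S(u):                      # linear
    x, y = u
    return ((x + y) % P, (x + 2 * y) % P)

def R(u):                      # linear
    x, y = u
    return ((x + 2 * y) % P, (2 * x + 4 * y) % P)

def N(u):                      # not linear: the first component squares x
    x, y = u
    return ((x * x) % P, y % P)

def kernel(F):
    """All inputs that F sends to the zero vector."""
    return {u for u in domain if F(u) == (0, 0)}

def is_one_to_one(F):
    """F is one-to-one when distinct inputs give distinct outputs."""
    return len({F(u) for u in domain}) == len(domain)

results = [(name, kernel(F) == {(0, 0)}, is_one_to_one(F))
           for name, F in [("S", S), ("R", R), ("N", N)]]
for row in results:
    print(row)
```

For the two linear maps the two answers agree, while for N they do not. Three examples prove nothing in general, but they indicate which step of the argument deserves a second look.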

In the next chapter, we will prove that every linear transformation between finite dimensional vector spaces has a matrix representation. This suggests that we need to consider the issue of rank. What is the relationship, if any, between the rank of a linear transformation T : U −→ U , the dimension of U , and whether T is one-to-one, onto, and invertible? Could Theorem 5.2.6 be “expanded” to include rank(T ) = dim(U) as a fourth equivalent condition?

The General Form of a System of Linear Equations

A system of m equations in n unknowns over a field K, say

a11x1 + a12x2 + · · ·+ a1nxn = b1

a21x1 + a22x2 + · · ·+ a2nxn = b2

...

am1x1 + am2x2 + · · ·+ amnxn = bm,

can be written in terms of the product of a matrix by a vector,

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}.
\]

The coefficients of the system are given by the matrix A, the unknowns by the vector X, and the constants by the vector B. Specifically, the system can be represented by the equation AX = B. The associated homogeneous system AX = 0 that is mentioned in Activity 9 is nothing more than the system

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix},
\]

where the constants b1, b2, . . . , bm are replaced by zeros. Given a particular solution X = vp of AX = B and X = v0, a solution of the associated homogeneous system, what did you find in Activity 9 regarding the vector sum vp + v0? Is it a solution of AX = B? Can every solution of AX = B be written this way? How do your findings compare with the statement of the theorem given below?

Theorem 5.2.7. Let AX = B be a system of m equations in n unknowns over a field K. A vector vs is a solution of AX = B if and only if vs = vp + v0, where vp is a particular solution of AX = B, and v0 is a solution of the associated homogeneous system AX = 0.

Proof. (=⇒:) Define T : Kn −→ Km by

T (〈x1, x2, . . . , xn〉) = 〈a11x1 + a12x2 + · · ·+ a1nxn,

a21x1 + a22x2 + · · ·+ a2nxn, . . . , am1x1 + am2x2 + · · ·+ amnxn〉 .

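As demonstrated in the first three activities, the kernel of T can be found by solving the homogeneous system

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\]

Any solution vs of the system

\[
\begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{m1} & a_{m2} & \cdots & a_{mn}
\end{pmatrix}
\cdot
\begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}
\]

is a vector whose components, when substituted for xi, satisfy the system given above, as well as the equation T (x) = b, where b = 〈b1, b2, . . . , bm〉. Let vp denote one particular solution. Using these relationships and the linearity of T , we can write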

T (vs) = T (vp)

T (vs)− T (vp) = 0

T (vs − vp) = 0.

Therefore, vs − vp = v0 ∈ ker(T ). A rearrangement of terms gives us the desired result: vs = vp + v0.

(⇐=:) Assume that vs = vp + v0, where vp is a particular solution of AX = B, and v0 is a solution of AX = 0. Then,

A(vs) = A(vp + v0) = A(vp) + A(v0) = B + 0 = B.

Therefore, vs is a solution of AX = B.

This theorem has two important interpretations. In R3, the solution set of the system

x1 − 2x2 + 3x3 = 1

3x1 − 4x2 + 5x3 = 3

2x1 − 3x2 + 4x3 = 2

is the set {t 〈1, 2, 1〉 + 〈1, 0, 0〉 : t ∈ R}. This is the line through 〈1, 0, 0〉 in the direction 〈1, 2, 1〉. The vector 〈1, 0, 0〉 is a particular solution of the system. The solution set of the associated homogeneous system

x1 − 2x2 + 3x3 = 0

3x1 − 4x2 + 5x3 = 0

2x1 − 3x2 + 4x3 = 0

is given by {t 〈1, 2, 1〉 : t ∈ R}, the line passing through the origin in the direction 〈1, 2, 1〉. Hence, the solution set of the non-homogeneous system can be represented as a translation of the solution set of the homogeneous system.

Figure 5.11: General solution of a system (diagram: the line of vectors tv through the origin and its translate, the line of vectors tv + a through a)
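The description of this solution set can be checked by substitution. The sketch below, in Python rather than ISETL, multiplies the coefficient matrix of the system above by vp + t v0 for a few sample values of t, and also confirms that v0 solves the homogeneous system. The helper name apply and the sample values of t are choices made here for illustration.

```python
# Coefficient matrix and constants of the system in R^3 discussed above.
A = [[1, -2, 3],
     [3, -4, 5],
     [2, -3, 4]]
B = [1, 3, 2]

def apply(A, x):
    """Compute the product A x as a list of row-by-vector dot products."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

vp = [1, 0, 0]   # particular solution of AX = B
v0 = [1, 2, 1]   # solution of the homogeneous system AX = 0

# Every vector vp + t*v0 should solve AX = B.
checks = []
for t in (-2, 0, 1, 5):
    vs = [p + t * h for p, h in zip(vp, v0)]
    checks.append(apply(A, vs) == B)

print(checks)
print(apply(A, v0) == [0, 0, 0])
```

Substitution confirms that each vector of this form is a solution; that there are no other solutions is the content of the (=⇒) direction of Theorem 5.2.7.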

If we define T : R3 −→ R3 by

T (〈x1, x2, x3〉) = 〈x1 − 2x2 + 3x3, 3x1 − 4x2 + 5x3, 2x1 − 3x2 + 4x3〉 ,

the kernel is the solution set of the homogeneous system

x1 − 2x2 + 3x3 = 0

3x1 − 4x2 + 5x3 = 0

2x1 − 3x2 + 4x3 = 0.

〈b1, b2, b3〉 is in the image space if the non-homogeneous system

x1 − 2x2 + 3x3 = b1

3x1 − 4x2 + 5x3 = b2

2x1 − 3x2 + 4x3 = b3

has a solution. If 〈p1, p2, p3〉 is a particular solution of the non-homogeneous system, then 〈b1, b2, b3〉 = T (t 〈1, 2, 1〉+ 〈p1, p2, p3〉) for every t ∈ R: every vector in the image space is the image under T of the sum of an element of the kernel and a particular pre-image.

As discussed earlier, we cannot necessarily find the kernel and the image space of a linear transformation in non-tuple contexts by solving a corresponding system of equations. However, the basic concept is the same. To find the kernel of a linear transformation T : U −→ V , we find all vectors u ∈ U that are solutions of the equation T (x) = 0V . Similarly, v ∈ V is in the image space of T if there exists u ∈ U such that T (u) = v. As with any linear transformation between two finite dimensional vector spaces, Theorem 5.2.5 applies in non-tuple contexts. This is verified by the proof, which was constructed independently of any specific form. In the exercises, you will be asked to work with transformations between spaces of polynomials and functions.
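As a small non-tuple illustration, consider the derivative D : P3(Z5) −→ P2(Z5) from Exercise 5(a) of Section 5.1. The sketch below, in Python rather than ISETL, assumes a polynomial a0 + a1x + a2x^2 + a3x^3 is stored as the list of its coefficients (a0, a1, a2, a3), finds the kernel and image space of D by enumeration, and compares the dimensions with Theorem 5.2.5. The helper dim_from_size is a hypothetical name and relies on the fact that a subspace of dimension d over Z5 has 5^d elements.

```python
from itertools import product

P = 5  # coefficients are taken modulo 5

def D(poly):
    """Derivative of a0 + a1 x + a2 x^2 + a3 x^3, as coefficients of P2(Z5)."""
    a0, a1, a2, a3 = poly
    return (a1 % P, (2 * a2) % P, (3 * a3) % P)

polys = list(product(range(P), repeat=4))  # every element of P3(Z5)

KER = {p for p in polys if D(p) == (0, 0, 0)}  # polynomials with zero derivative
IMAGE = {D(p) for p in polys}                  # derivatives that actually occur

def dim_from_size(size, p):
    """Dimension d of a subspace over Z_p that has p**d elements."""
    d = 0
    while p ** d < size:
        d += 1
    assert p ** d == size, "the size of a subspace must be a power of p"
    return d

nullity = dim_from_size(len(KER), P)
rank = dim_from_size(len(IMAGE), P)
print(sorted(KER))
print(nullity, rank, nullity + rank == 4)
```

The printed kernel consists of the constant polynomials, and the dimensions add up to 4, the dimension of P3(Z5), even though no system of equations was written down.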

Exercises

1. Let T : R3 −→ R4 be a transformation given by

T (〈x1, x2, x3〉) =

〈x1 + 2x3, 3x2 − 4x3, x1 − x2 + 4x3,−2x1 + x2 − 4x3〉 .

(a) Show that T is a linear transformation.

(b) Find the kernel of T .

(c) Construct a basis for the kernel of T .

(d) Construct a basis for the image space of T .

(e) Verify the rank and nullity theorem for this transformation.

2. Let

x1 + 2x2 − 3x3 = 0

−2x1 + 4x2 − 2 + 6x3 = 0

2x2 − x3 = 0

−4x2 + 2x3 = 0

be a homogeneous system of 4 equations in 3 unknowns.

(a) Find a basis for the solution set.


(b) Find a linear transformation T : R3 −→ R4 whose kernel is equal to the solution set of the system of equations given here.

(c) Find a basis for the kernel of T .

(d) Find a basis for the image space of T .

(e) Verify the rank and nullity theorem for this transformation.

3. Define T : R3 −→ R by

T (〈x1, x2, x3〉) = 3x1 + 2x2 − x3.


(b) Find a basis for the kernel of T .

(c) Find a basis for the image T (R3) of T .

(d) Describe the kernel of T geometrically.

4. Suppose that the image space of a linear transformation F : R4 −→ R4 is spanned by the set

{〈2,−1, 3, 1〉 , 〈1, 0,−2, 4〉 , 〈1,−1, 5,−3〉 , 〈4,−2, 6, 2〉}.

(a) Find an expression for F . (Note: There may be more than one expression that satisfies the condition given above. Your job here is to find one such expression.)

(b) Find a basis for the kernel of F .

(c) Verify the rank and nullity theorem for this transformation.

5. Suppose that the kernel of a linear transformation G : R4 −→ R4 is spanned by the set

{〈2, 3, 1,−1〉 , 〈−1, 4, 5,−2〉 , 〈3,−1,−4, 1〉 , 〈4, 6, 2,−2〉}.

(a) Find an expression for G. (Note: There may be more than one expression that satisfies the condition given above. Your job here is to find one such expression.)

(b) Find a basis for the image space of G.

(c) Verify the rank and nullity theorem for this transformation.


6. Suppose H : R5 −→ R5 is a linear transformation such that dim[ker(H)] = 3. Is it possible for the image space to be spanned by the set

{〈−1, 2, 4,−2, 3〉 , 〈2, 0,−1, 3,−3〉 , 〈−2, 4, 0, 1, 2〉 , 〈0, 4,−1, 4,−1〉}?

Justify your answer using the Rank and Nullity Theorem.

7. Define ∫_0^1 : C∞(R) −→ R by

∫_0^1 f = ∫_0^1 f(t) dt.

Describe the kernel of ∫_0^1.

8. Let D2 : C∞(R) −→ C∞(R) represent the second derivative. Let I : C∞(R) −→ C∞(R) denote the identity transformation (I(f) = f). Define L : C∞(R) −→ C∞(R) by L = D2 + I. Find the kernel of L. What is the relationship between the kernel of L and the solution set of the differential equation f ′′ + f = 0?

9. Provide a proof of Theorem 5.2.1.



12. Find the solution set of the homogeneous system associated with the transformation T defined by

T (〈x1, x2, x3, x4〉) = 〈x1 − 2x2 + 3x3 + 5x4, x1 − x2 + 8x3 + 7x4, 2x1 − 4x2 + 6x3 + 10x4〉 .

What is the relationship between the solution set of the homogeneous system that corresponds to this transformation and ker(T )?


14. Find the solution set of the system of equations

x1 − x2 + x3 + 2x4 − 2x5 = 1

2x1 − x2 − x3 + 3x4 − x5 = 3

−x1 − x2 + 5x3 − 4x5 = −3.


Is the vector 〈1, 3,−3〉 an element of the image of the transformation T : R5 −→ R3 defined by

T (〈x1, x2, x3, x4, x5〉) = 〈x1 − x2 + x3 + 2x4 − 2x5, 2x1 − x2 − x3 + 3x4 − x5, −x1 − x2 + 5x3 − 4x5〉?

If so, find the set of vectors in R5 whose image under T is the vector 〈1, 3,−3〉. If not, explain why the vector 〈1, 3,−3〉 is not in the image T (R5) of T .

15. Suppose that G : U −→ R6 is a linear transformation. Each part gives a different scenario involving G and U . Answer each question on the basis of the given scenario.

(a) If dim[G(U)] = 4 and dim[ker(G)] = 3, what would be the dimension of U? Justify your answer.

(b) If G were onto, what could we conclude about the dimension of the domain space U?

(c) If dim[G(U)] = 6 and dim(U) = 6, what could we say about G? Is G one-to-one? onto?

(d) If dim(U) ≠ dim(R6), does Theorem 5.2.6 apply? For example, if G is one-to-one, must G also be onto? If G is onto, is G necessarily invertible?

16. Let T : R4 −→ R5 be defined by T (u) = 0 for all u ∈ R4. What is the dimension of the kernel of T? Justify your answer.

17. Let T : Kn −→ Km be defined by

T (〈x1, x2, . . . , xn〉) = 〈a11x1 + a12x2 + · · · + a1nxn, a21x1 + a22x2 + · · · + a2nxn, . . . , am1x1 + am2x2 + · · · + amnxn〉 .

(a) Write the expression for T in terms of matrices; that is, write the expression for T as T (x) = A · x.

(b) Show that the jth column of A is T (ej), the image of the vector ej, where ej is the coordinate basis vector in which 1 appears in the jth component.

(c) Show that I = {T (e1), T (e2), . . . , T (en)} spans the image space of T .

(d) What does Theorem 5.2.5 say about the rank of A and the number of vectors in I, after I has been “reduced” to a linearly independent set? By “reduced”, we mean the application of Theorem 4.3.2.

18. If T : U −→ V is any vector space transformation, show that T is invertible, that is, there exists T−1 : ImageSpace(T ) −→ U , if and only if T is one-to-one.

19. If T : U −→ V is an invertible linear transformation, show that T−1 : ImageSpace(T ) −→ U is linear as well.

20. Define D : PF3(R) −→ PF3(R) by

D(p) = p′,

where p′ is the first derivative of p.

(a) Show that D is a linear transformation.

(b) Find a basis for the kernel of D.

(c) Find a basis for the image of D.

21. Define D2 : PF3(R) −→ PF3(R) by

D2(p) = p′′,

where p′′ is the second derivative of p.

(a) Show that D2 is a linear transformation.

(b) Find a basis for the kernel of D2.

(c) Find a basis for the image of D2.

22. Define H : PF2(R) −→ PF3(R) by

H(p) = x −→ 2p′(x) + ∫_0^x 3p(t) dt,

where p′ denotes the first derivative of p.


(a) Show that H is a linear transformation.

(b) Find a basis for the kernel of H.

(c) Find a basis for the image of H.

23. Define T : PF2(R) −→ R4 by

T (a0 + a1x + a2x^2) = 〈a0, a1, a0 + a1, 0〉 .


(b) Find a basis for the kernel of T .

(c) Find a basis for the image of T .

24. Define F : PF2(R) −→ PF2(R) by

F (p) = q, where q(x) = p(−x) + p(x)

(a) Show that F is a linear transformation.

(b) Find a basis for the kernel of F .

(c) Find a basis for the image of F .

25. Define G : PF2(R) −→ PF2(R) by

G(p) = q, where q(x) = (2x+ 3)p(x).

(a) Show that G is a linear transformation.

(b) Find a basis for the kernel of G.

(c) Find a basis for the image of G.

26. Define H : PF2(R) −→ R by

H(p) = p(1)

(a) Show that H is a linear transformation.

(b) Find a basis for the kernel of H.

(c) Find a basis for the image of H.


27. Use the results of Theorem 5.2.7 to find the general form of a solution for the system of equations

3x1 − 3x2 + 3x3 = a

2x1 − x2 + 4x3 = b

3x1 − 5x2 − x3 = c,

where one particular solution is of the form 〈1,−1, 1〉.

28. Find the form of the system of equations whose general solution consists of the particular solution 〈−1, 3, 4〉 and whose associated homogeneous system has solution set generated by the basis

{〈2, 1, 2〉 , 〈−1, 1, 3〉}.


5.3 New Constructions from Old

Activities

1. There are four sets of linear transformations given below. Write an ISETL func that implements each transformation. Then, construct four sets, R, S, F, and P, according to the categorizations given below. Save these sets of funcs in a file called LTexamples.

Let R = {Ri : (Z5)2 −→ (Z5)2 : i = 1, 2, 3, 4, 5, 6, 7} be a set of linear transformations. The expression for each transformation is given below.

R1(〈v1, v2〉) = 〈v1 + 2v2, 2v1 + 3v2〉
R2(〈v1, v2〉) = 〈v1 + 2v2, 4v1 + 3v2〉
R3(〈v1, v2〉) = 〈3v1 + v2, 0〉
R4(〈v1, v2〉) = 〈3v1, 2v2〉
R5(〈v1, v2〉) = 〈v1 + 2v2, v1 + 2v2〉
R6(〈v1, v2〉) = 〈2v1 + 2v2, 2v1 + 4v2〉
R7(〈v1, v2〉) = 〈4v1 + 2v2, 4v1〉.

Let S = {Si : (Z5)2 −→ (Z5)3 : i = 1, 2, 3, 4, 5} be a set of linear transformations. The expression for each transformation is given below.

S1(〈v1, v2〉) = 〈v1 + 2v2, 2v1 + 3v2, 3v1〉
S2(〈v1, v2〉) = 〈2v1 + 4v2, v1 + 2v2, 4v1 + 3v2〉
S3(〈v1, v2〉) = 〈3v1 + v2, 2v1 + 2v2, 0〉
S4(〈v1, v2〉) = 〈3v1, 2v2, 0〉
S5(〈v1, v2〉) = 〈3v1 + v2, 3v1, 2v1 + 3v2〉.

Let F = {Fi : (Z5)4 −→ (Z5)2 : i = 1, 2, 3, 4} be a set of linear transformations. The expression for each transformation is given below.

F1(〈v1, v2, v3, v4〉) = 〈v1 + 2v2 + v3, 2v1 + v2 + v3〉
F2(〈v1, v2, v3, v4〉) = 〈2v1 + v2, 2v2 + v3〉
F3(〈v1, v2, v3, v4〉) = 〈v1 + 2v2, v1 + 2v2〉
F4(〈v1, v2, v3, v4〉) = 〈2v1, v2〉.

Let P = {Pi : (Z5) −→ (Z5)4 : i = 1, 2} be a set of linear transformations. The expression for each transformation is given below.

P1(〈v1〉) = 〈2v1, v1, 0, v1〉
P2(〈v1〉) = 〈v1, 2v1, 2v1, 0〉.

2. (a) Run name_vector_space by setting K equal to Z5, U equal to (Z5)2 and V equal to (Z5)3. Given S1, as defined in Activity 1, and the scalar 3 ∈ Z5, explain what you think is meant by 3S1, and write a func that implements it. Apply the func is_linear that you constructed in Activity 3 of Section 5.1 to 3S1. Is 3S1 a linear transformation?

(b) Write a func LTsm that accepts a scalar a and a linear transformation F , and returns a func that implements aF .

(c) Assume that name_vector_space has been run. Write a func that accepts a set of linear transformations from U to V ; determines whether the scalar multiple of each transformation in the set is linear; returns true, if each transformation is linear, or false, if one or more transformations is not linear. Apply your func to each of the sets R, S, F, and P defined in Activity 1. Note that you will need to adjust the inputs for name_vector_space accordingly. State a conjecture that summarizes what you observe.

3. (a) Run name_vector_space by setting K equal to Z3, U equal to (Z3)4 and V equal to (Z3)2. Given F1 and F2, as defined in Activity 1, explain what you think is meant by F1 + F2, and write a func that implements F1 + F2. Apply is_linear to F1 + F2. Is F1 + F2 a linear transformation?

(b) Why doesn’t the procedure in part (a) work for F1 + R2? for S1 + R1? Specify condition(s) under which the sum of two linear transformations is defined.

(c) Write a func LTadd that accepts two linear transformations A and B; determines whether the sum of A and B is defined; and returns the sum A + B, if the sum is defined, or OM, if the sum is not defined.

(d) Form the sum S3 + S4 by applying the func LTadd. Determine whether the resulting sum is a linear transformation by applying the func is_linear. What do you observe?


(e) Write a func that assumes that name_vector_space has been run; accepts a set of linear transformations from U to V ; determines whether the sum of each pair is linear; returns true, if each sum is a linear transformation, or false, if one or more of the sums is not a linear transformation. Apply this func to each of the sets R, S, F, and P defined in Activity 1. Note that you will need to adjust the inputs for name_vector_space accordingly. State a conjecture that summarizes what you observe.

4. (a) Apply the func LTadd to S1 and S2 from Activity 1, and determine whether the resulting sum is equal to either S3, S4, or S5.

(b) Are the funcs R5 and F3 that were defined in Activity 1 equal?

(c) Are the funcs R4 and S4 that were defined in Activity 1 equal?

(d) Write a func is_equal that assumes that name_vector_space has been run; accepts two linear transformations; determines whether the two inputs are equal; and returns true, if they are, or false, if they are not.

(e) Use is_equal to find all pairs of linear transformations in R whose sum is equal to another linear transformation in R. Repeat for the set F.

5. Let G = {Gi : (Z2)1 −→ (Z2)2 : i ∈ {1, 2, 3, 4}} be a set of transformations with the expression for each Gi given below:

G1(〈v〉) = 〈0, 0〉
G2(〈v〉) = 〈v, 0〉
G3(〈v〉) = 〈0, v〉
G4(〈v〉) = 〈v, v〉.

(a) Write an ISETL func for each transformation. Apply the func is_linear to verify that each transformation is linear.

(b) What are the scalars in Z2? For each transformation, determine each of its scalar multiples. Use the func is_equal that you constructed in Activity 4 to determine whether each scalar multiple is equal to either G1, G2, G3, or G4. What do you observe?

(c) Apply the func LTadd to find the sum Gi + Gj for all possible combinations i, j such that i, j ∈ {1, 2, 3, 4}. Apply the func is_equal to determine whether each sum is equal to either G1, G2, G3, or G4. What do you observe?

(d) Apply the func is_vector_space (see Activity 6 in Section 2.2) to the set G, together with the operations defined on G, as given above. Does ISETL return a response consistent with the results you obtained in (b) and (c)?

6. Let T = {Ti : (Z2)2 −→ (Z2)2 : i ∈ {1, 2, . . . , 24}} be a set of transformations with the expression for each Ti given below:

T1(〈v1, v2〉) = 〈0, 0〉
T2(〈v1, v2〉) = 〈v1, 0〉
T3(〈v1, v2〉) = 〈v1, v1〉
T4(〈v1, v2〉) = 〈v1, v2〉
T5(〈v1, v2〉) = 〈v1, v1 + v2〉
T6(〈v1, v2〉) = 〈v2, 0〉
T7(〈v1, v2〉) = 〈v2, v1〉
T8(〈v1, v2〉) = 〈v2, v2〉
T9(〈v1, v2〉) = 〈v2, v1 + v2〉
T10(〈v1, v2〉) = 〈0, v1〉
T11(〈v1, v2〉) = 〈v1, v1〉
T12(〈v1, v2〉) = 〈v2, v1〉
T13(〈v1, v2〉) = 〈v1 + v2, v1〉
T14(〈v1, v2〉) = 〈0, v2〉
T15(〈v1, v2〉) = 〈v1, v2〉
T16(〈v1, v2〉) = 〈v2, v2〉
T17(〈v1, v2〉) = 〈v1 + v2, v2〉
T18(〈v1, v2〉) = 〈0, v1 + v2〉
T19(〈v1, v2〉) = 〈v1, v1 + v2〉
T20(〈v1, v2〉) = 〈v2, v1 + v2〉
T21(〈v1, v2〉) = 〈v1 + v2, v1 + v2〉
T22(〈v1, v2〉) = 〈v1 + v2, 0〉
T23(〈v1, v2〉) = 〈v1 + v2, v1〉
T24(〈v1, v2〉) = 〈v1 + v2, v2〉.

Let B = {Bi : (Z2)2 −→ (Z2)2 : i ∈ {1, 2, 3, 4}} be a set of transformations with the expression for each Bi given below:

B1(〈v1, v2〉) = 〈v1, 0〉

B2(〈v1, v2〉) = 〈v2, 0〉

B3(〈v1, v2〉) = 〈0, v1〉

B4(〈v1, v2〉) = 〈0, v2〉.

(a) Apply the func is_vector_space to determine whether the set T , together with the operations defined on T , forms a vector space.

(b) Determine whether each transformation Ti can be expressed as an element of B or a sum of elements of B.

(c) Determine whether the set B is linearly independent. If not, why not?

(d) Is the set B a basis for the set T? If so, what is the dimension of the set T? If not, how does B fail to constitute a basis?

7. (a) Run name_vector_space by setting K equal to Z3, U equal to (Z3)2, and V equal to (Z3)3. Given R1 and S1, as defined in Activity 1, explain what is meant by S1 ◦ R1, where ◦ indicates that R1 is followed by S1. Write a func that implements S1 ◦ R1.

(b) Why doesn’t the procedure in part (a) work for R1 ◦ S1? P1 ◦ T1? Under what condition(s) is the composition of two transformations defined?

(c) Write a func LTcomp that accepts two linear transformations A and B; determines whether the composition of A and B is defined; and returns A ◦ B, if the composition is defined, or OM, if the composition is not defined.

(d) Run name_vector_space by setting K equal to Z3, U equal to (Z3)2, and V equal to (Z3)2. Apply the ISETL command arb to U to choose three vectors from (Z3)2. Determine whether R1 ◦ R3 applied to each of these three vectors returns the same result as R5. Apply the func is_equal to R1 ◦ R3 and R5. Is the result returned by is_equal consistent with that returned by R1 ◦ R3 and R5 when applied to each of the three vectors selected by arb?

(e) Apply the func is_linear to the composition R2 ◦ R4. What do you find?

(f) Write a func that assumes that name_vector_space has been run; accepts two sets of linear transformations; determines whether the composition of a transformation from the first set followed by a transformation from the second set is defined and is linear; returns true, if each composition is a linear transformation, or false, if one or more of the compositions is not a linear transformation, or OM, if the composition operation is undefined. Apply this func to each of the pairs (R,R), (S,R), and (P,F), where R, S, F, and P are the sets defined in Activity 1. Note that you will need to adjust the inputs for name_vector_space accordingly. State a conjecture that summarizes what you observe.

(g) Find all pairs A, B in R such that A ◦ B = B ◦ A. Does the equality hold for every pair in R? Describe your observation in a single sentence using the word ‘commutative’.

Discussion

Scalar Multiple of a Linear Transformation

In Activity 2, you were asked to define the mapping 3S1. In order to get an expression for this map, we simply multiply each component of the expression for S1 by 3, with all arithmetic done in Z5:

3S1(〈v1, v2〉) = 3 · 〈v1 + 2v2, 2v1 + 3v2, 3v1〉
= 〈3(v1 + 2v2), 3(2v1 + 3v2), 3(3v1)〉
= 〈3v1 + v2, v1 + 4v2, 4v1〉 .


One can find the scalar multiple of any transformation in a similar manner, as suggested in the following definition.

Definition 5.3.1. Let T : U −→ V be a transformation between two vector spaces U and V with scalars in a field K. Given a scalar a ∈ K, the scalar multiple of T by a, denoted aT , is a transformation that assigns to each vector u ∈ U a vector aT (u) ∈ V , where T (u) represents the vector assigned to u by T . This is represented by the notation

(aT )(u) = a(T (u)).

In constructing the func is_linear in Activity 3 of Section 5.1, you most likely checked the single condition,

T (cu1 + du2) = cT (u1) + dT (u2),

or the equivalent, two-part version,

T (cu) = cT (u)

T (u1 + u2) = T (u1) + T (u2),

given in Definition 5.1.1. Is your finding in Activity 2 consistent with the following theorem?

Theorem 5.3.1. Let U and V be vector spaces with scalars in K, and let T : U −→ V be a linear transformation. If a ∈ K is a scalar, then the scalar multiple aT : U −→ V is a linear transformation.

Proof. Let u1,u2 ∈ U , and let a, c be scalars.

(aT )(cu1) = a(T (cu1))

= a(cT (u1))

= (ac)T (u1)

= (ca)T (u1)

= c(aT (u1))

= c(aT )(u1)

Can you justify each step? To finish the proof, we still need to show that

(aT )(u1 + u2) = (aT )(u1) + (aT )(u2).

This can be done using the same ideas as in the first part of the proof. You will be asked to complete this proof as an exercise. See Exercise 3.
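Definition 5.3.1 and the single-condition linearity test can be tried out on 3S1. The sketch below is written in Python rather than ISETL and is not the func the activities ask you to write. The names lt_sm and is_linear mirror the activity names, but their signatures here (in particular the dim argument and the module-level modulus P) are assumptions made for illustration.

```python
from itertools import product

P = 5  # all arithmetic is done mod 5, as in Activity 2(a)

def S1(v):
    """S1 : (Z5)^2 -> (Z5)^3 from Activity 1."""
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P, (3*v1) % P)

def lt_sm(a, T):
    """Return the transformation aT, defined by (aT)(u) = a(T(u))."""
    return lambda u: tuple((a * c) % P for c in T(u))

def is_linear(T, dim):
    """Brute-force check of T(c*u1 + d*u2) = c*T(u1) + d*T(u2) on (Z_P)^dim."""
    vectors = list(product(range(P), repeat=dim))
    for u1, u2 in product(vectors, repeat=2):
        for c, d in product(range(P), repeat=2):
            combo = tuple((c*x + d*y) % P for x, y in zip(u1, u2))
            expected = tuple((c*x + d*y) % P for x, y in zip(T(u1), T(u2)))
            if T(combo) != expected:
                return False
    return True

three_S1 = lt_sm(3, S1)
print(three_S1((1, 0)), three_S1((0, 1)))  # images of the coordinate basis vectors
print(is_linear(three_S1, 2))
```

An exhaustive check is feasible only because (Z5)2 is finite; for spaces over R, the proof of Theorem 5.3.1 is what establishes linearity.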


The Sum of Two Linear Transformations

In Activity 3(a), you wrote a func to express the sum of the two funcs F1 and F2. As with the sum of any two functions, the sum of two linear transformations consists of taking the vector assigned to an input vector u under F1 and adding it to the vector assigned to u under F2. If each transformation is given by an expression, the sum is defined by adding the individual expressions. How is the method by which you obtained the expression for F1 + F2 similar to the way in which you found the expression for 3S1 in Activity 2(a)?

Your work in the activities and the discussion here should convince you that the sum is not defined unless the two transformations have the same domain and range spaces. The sum F1 + R2 given in Activity 3(b) is not defined, because the domains differ. Can you explain exactly why this is a problem? The second sum in part (b), S1 + R1, is not defined, because the ranges differ. Why must the ranges be the same? Ideas related to the sum of two transformations are summarized in the definition below.

Definition 5.3.2. Let U and V be vector spaces with scalars in K, and let T : U −→ V and F : U −→ V be two transformations. Given u ∈ U , the sum of T and F , denoted

T + F : U −→ V,

is defined by taking the sum of the vector assigned to u under T , T (u), with the vector assigned to u under F , F (u). The notation for this is

(T + F )(u) = T (u) + F (u).

Is your result in Activity 3 consistent with the following theorem?

Theorem 5.3.2. Let U and V be vector spaces with scalars in K, and let T : U −→ V and F : U −→ V be linear transformations. Then, the sum T + F : U −→ V is a linear transformation.

Proof. Let u1,u2 ∈ U , and let c be a scalar.

(T + F )(u1 + u2) = T (u1 + u2) + F (u1 + u2)

= [T (u1) + T (u2)] + [F (u1) + F (u2)]

= [T (u1) + F (u1)] + [T (u2) + F (u2)]

= (T + F )(u1) + (T + F )(u2)


Can you justify each step? To finish the proof of the theorem, we still need to show that

(T + F )(cu1) = c(T + F )(u1).

This can be done using the same ideas as in the first part of the proof. You will be asked to complete the proof of this theorem as an exercise. See Exercise 8.
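The requirement that both transformations share a domain and a range space can be made explicit in code. The following Python sketch (not ISETL) records the dimensions of the domain and range alongside each rule; the LT record, its field names, and lt_add are hypothetical names introduced for illustration, and Python's None stands in for ISETL's OM. Arithmetic is done mod 5, following the definitions in Activity 1.

```python
from typing import Callable, NamedTuple, Optional

P = 5  # arithmetic mod 5, as in Activity 1

class LT(NamedTuple):
    """A transformation (Z_P)^dom -> (Z_P)^cod together with its rule."""
    dom: int
    cod: int
    rule: Callable

def lt_add(A: LT, B: LT) -> Optional[LT]:
    """Return A + B, or None when the domains or range spaces differ."""
    if (A.dom, A.cod) != (B.dom, B.cod):
        return None  # plays the role of ISETL's OM
    return LT(A.dom, A.cod,
              lambda u: tuple((x + y) % P for x, y in zip(A.rule(u), B.rule(u))))

F1 = LT(4, 2, lambda v: ((v[0] + 2*v[1] + v[2]) % P, (2*v[0] + v[1] + v[2]) % P))
F2 = LT(4, 2, lambda v: ((2*v[0] + v[1]) % P, (2*v[1] + v[2]) % P))
R2 = LT(2, 2, lambda v: ((v[0] + 2*v[1]) % P, (4*v[0] + 3*v[1]) % P))

total = lt_add(F1, F2)
# F1 sends <1, 1, 1, 0> to <4, 4> and F2 sends it to <3, 3>; the sum is taken mod 5.
print(total.rule((1, 1, 1, 0)))
print(lt_add(F1, R2))  # the domains differ, so the sum is undefined
```

Carrying the dimensions with the rule is one design choice among several; in ISETL the same information is supplied by the values passed to name_vector_space.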

Equality of Linear Transformations

Generally speaking, two functions f and g are equal if their domains are equal, their ranges are equal, and if f and g assign the same output to a given input. Since linear transformations are functions, the requirement for equality is the same. A linear transformation T : U1 −→ V1 is equal to a linear transformation F : U2 −→ V2 if and only if U1 = U2, V1 = V2, and T (u) = F (u) for every input u. Consider R5 and F3 in Activity 1: each is given by the same expression, and both range spaces are the same. So, what is the problem? Why are these two transformations not equal to one another? Consider R4 and S4 defined in Activity 1: the domain of each transformation is the same, and the expression for each transformation looks similar. However, R4 ≠ S4. Can you explain why?

The func is_equal you constructed in Activity 4 checks to see whether two transformations between finite vector spaces are equal. When you applied is_equal in Activity 5(b) and (c), what did you find? Was each combination considered in parts (b) and (c) equal to one of the transformations given in the set G? If so, what significance does this have? In particular, does the set G form a vector space?
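The three conditions for equality translate directly into a test over a finite vector space. The sketch below is in Python, not ISETL; the LT record and the signature of is_equal are assumptions made for illustration, and R5_alt is a hypothetical transformation added only to show a pair that is equal despite being written differently.

```python
from itertools import product
from typing import Callable, NamedTuple

P = 5  # arithmetic mod 5, as in Activity 1

class LT(NamedTuple):
    """A transformation (Z_P)^dom -> (Z_P)^cod together with its rule."""
    dom: int
    cod: int
    rule: Callable

def is_equal(A: LT, B: LT) -> bool:
    """Equal domains, equal range spaces, and equal outputs on every input."""
    if A.dom != B.dom or A.cod != B.cod:
        return False
    return all(A.rule(u) == B.rule(u) for u in product(range(P), repeat=A.dom))

R5 = LT(2, 2, lambda v: ((v[0] + 2*v[1]) % P, (v[0] + 2*v[1]) % P))
F3 = LT(4, 2, lambda v: ((v[0] + 2*v[1]) % P, (v[0] + 2*v[1]) % P))
R4 = LT(2, 2, lambda v: ((3*v[0]) % P, (2*v[1]) % P))
S4 = LT(2, 3, lambda v: ((3*v[0]) % P, (2*v[1]) % P, 0))
# Hypothetical: the same map as R5, written with different coefficients mod 5.
R5_alt = LT(2, 2, lambda v: ((6*v[0] + 7*v[1]) % P, (v[0] + 2*v[1]) % P))

print(is_equal(R5, F3))      # same expression, different domains
print(is_equal(R4, S4))      # same domain, different range spaces
print(is_equal(R5, R5_alt))  # different-looking expressions, same function
```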

A Set of Linear Transformations as a Vector Space

The set of transformations G = {G1, G2, G3, G4} given in Activity 5 is the set of all linear transformations between (Z2)1 and (Z2)2. In parts (b) and (c), you showed that each scalar multiple and each sum is equal to one of the transformations in G. Application of the func is_vector_space confirmed these findings. Of the vector space axioms, which axioms correspond to what you were checking in parts (b) and (c)?

As it turns out, Activity 5 is a specific example of a much more general result. In particular, the set of all linear transformations between two vector spaces U and V , denoted Hom(U, V ), together with transformation addition and scalar multiplication, forms a vector space. Here, each transformation is a vector, the operation of adding two transformations represents vector addition, and the operation of multiplying a transformation by a scalar represents scalar multiplication. Hom is an abbreviation for the term homomorphism, which will be defined carefully in an abstract algebra course. For our purposes, we only need to know that Hom(U, V ) denotes the collection of all linear transformations between U and V .

Theorem 5.3.3. Let U and V be vector spaces with scalars in K. The set of all linear transformations, Hom(U, V ), together with transformation addition and transformation scalar multiplication, forms a vector space.

Proof. In order to prove this theorem, you will need to check each of the vector space axioms. The full proof is left as an exercise (see Exercise 11), but the following questions and comments are designed to help you to write a complete proof.

Theorem 5.3.1 shows that the scalar multiple of a linear transformation is again a linear transformation. Which vector space axiom is satisfied by this theorem? Similarly, Theorem 5.3.2 shows that the sum of two linear transformations is itself a linear transformation. Which vector space axiom is satisfied here?

Since addition of functions is commutative and associative, the addition operation defined here is both commutative and associative.

The transformation defined by Z(u) = 0V for all u ∈ U is called the zero transformation.

For F ∈ Hom(U, V ), the transformation −F ∈ Hom(U, V ) denotes its additive inverse.

As with any vector space, we can talk about finding a basis. You did just that in Activity 6. What is the dimension of T? Can you find a basis for the vector space G defined in Activity 5?

Creating New Linear Transformations

In the last subsection, we created a new vector space by considering the set of all linear transformations between two previously defined vector spaces. We can do even more: in particular, we can often define a function between a vector space of transformations and a vector space of tuples that preserves linearity. For example, define a function L : G −→ (Z2)2 between the vector space G given in Activity 5 and the vector space (Z2)2 by:

L(G1) = 〈0, 0〉
L(G2) = 〈1, 0〉
L(G3) = 〈0, 1〉
L(G4) = 〈1, 1〉 .

Can you show that L is a linear transformation, that is:

L(Gi + Gj) = L(Gi) + L(Gj), where Gi, Gj ∈ G, i, j ∈ {1, 2, 3, 4}
L(cGi) = cL(Gi), where c ∈ Z2, and Gi ∈ G, i ∈ {1, 2, 3, 4}?

Can you define a similar linear transformation between the set of all linear transformations T defined in Activity 6 and (Z2)4? Transformations such as these will be considered in more detail in Chapter 7.
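Because G has only four elements, the two conditions on L can be checked case by case. The Python sketch below (not ISETL) does this by forming each sum and scalar multiple as a function, identifying which Gk it equals from its table of values, and comparing L of the result with the corresponding tuple operation. The dictionaries G and L and the helper names are assumptions made for illustration; the check complements, and does not replace, a written argument.

```python
P = 2  # arithmetic mod 2

# The four transformations of Activity 5, keyed by index.
G = {
    1: lambda v: (0, 0),
    2: lambda v: (v[0], 0),
    3: lambda v: (0, v[0]),
    4: lambda v: (v[0], v[0]),
}
# The function L as defined in the text.
L = {1: (0, 0), 2: (1, 0), 3: (0, 1), 4: (1, 1)}

DOMAIN = [(0,), (1,)]  # every vector of (Z2)^1

def table(f):
    """The values of f on the whole domain; identifies f uniquely."""
    return tuple(f(v) for v in DOMAIN)

index_of = {table(g): i for i, g in G.items()}

def add(f, g):
    return lambda v: tuple((x + y) % P for x, y in zip(f(v), g(v)))

def scale(c, f):
    return lambda v: tuple((c * x) % P for x in f(v))

def vec_add(x, y):
    return tuple((a + b) % P for a, b in zip(x, y))

def vec_scale(c, x):
    return tuple((c * a) % P for a in x)

additive = all(L[index_of[table(add(G[i], G[j]))]] == vec_add(L[i], L[j])
               for i in G for j in G)
homogeneous = all(L[index_of[table(scale(c, G[i]))]] == vec_scale(c, L[i])
                  for c in range(P) for i in G)
print(additive, homogeneous)
```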

Compositions of Linear Transformations

In Activity 7, you wrote a func to express the composition S1 ◦ R1 of the two funcs R1 and S1 defined in Activity 1. As with any two functions, the composition of two linear transformations, say A ◦ B, consists of taking an input vector u, applying B to u, and then finding the image of B(u) under A, provided that the application of A to B(u) is defined.

[Figure 5.12: Composition. Labels: U, V, W, B, A, u, B(u), A(B(u)).]

Why are the compositions R1 ◦ S1 and P1 ◦ T1 that you considered in part (b) undefined? In general, if F : U −→ V and G : W −→ Z are two linear transformations, what condition must be satisfied in order to ensure that the composition G ◦ F is defined? Under what condition is the operation of composition defined for pairs of transformations from Hom(U, V )? In this context, what does the word “closed” mean in the statement of the theorem given below?

Theorem 5.3.4. Let U be a vector space with scalars in a field K. The set of all linear transformations Hom(U,U) is closed under the operation of composition.

In Activity 7(g), you considered the issue of commutativity. Specifically, you were trying to determine whether A ◦ B = B ◦ A for every pair of transformations in R. On the basis of your findings in this activity, can you conclude that composition is a commutative operation?
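One pair from R is enough to experiment with the order of composition. The following Python sketch (not ISETL) composes R1 and R2 from Activity 1 in both orders, with arithmetic mod 5 as in their definitions; compose is a hypothetical helper, and the sketch examines this single pair only, not every pair in R.

```python
from itertools import product

P = 5  # arithmetic mod 5, as in Activity 1

def R1(v):
    v1, v2 = v
    return ((v1 + 2*v2) % P, (2*v1 + 3*v2) % P)

def R2(v):
    v1, v2 = v
    return ((v1 + 2*v2) % P, (4*v1 + 3*v2) % P)

def compose(A, B):
    """Return A o B: apply B first, then A."""
    return lambda u: A(B(u))

R1_after_R2 = compose(R1, R2)
R2_after_R1 = compose(R2, R1)

# Compare the two compositions on the coordinate vector <1, 0> ...
print(R1_after_R2((1, 0)), R2_after_R1((1, 0)))
# ... and on every vector of (Z5)^2.
commute = all(R1_after_R2(u) == R2_after_R1(u)
              for u in product(range(P), repeat=2))
print(commute)
```

Because both maps send (Z5)2 to itself, both orders of composition are defined, which is the situation described by Theorem 5.3.4.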

Exercises

1. Let F : R2 −→ R3 be defined by

F (〈v1, v2〉) = 〈v1 − v2, 3v1 + v2, 4v2〉 .

(a) Verify that F is a linear transformation.

(b) Let a ∈ R. Show that aF forms a linear transformation.

2. Justify each step of the proof of Theorem 5.3.1 that establishes

(aT )(cu1) = c(aT )(u1).

3. Complete the proof of Theorem 5.3.1 by showing that

(aT )(u1 + u2) = (aT )(u1) + (aT )(u2).

4. Let T : PF3(R) −→ PF3(R) be defined by

T (p) = q, where q(x) = p(x + 3).

Let a ∈ R. Show that aT is a linear transformation.


5. Let J : PF2(R) −→ PF3(R) be defined by

J(p) = x −→ ∫_0^x p(t) dt.

Let a ∈ R. Show that aJ is a linear transformation.

6. Let F1 : R3 −→ R2 be defined by

F1(〈v1, v2, v3〉) = 〈3v2 − 2v3,−4v1 + v2〉 .

Let F2 : R3 −→ R2 be defined by

F2(〈v1, v2, v3〉) = 〈v1 − v3, v2 − v3〉 .

(a) Verify that F1 and F2 are linear transformations.

(b) Show that F1 + F2 is a linear transformation.

7. Justify each step of the proof of Theorem 5.3.2 that establishes

(T + F )(u1 + u2) = (T + F )(u1) + (T + F )(u2).

8. Complete the proof of Theorem 5.3.2 by showing that

(T + F )(cu1) = c(T + F )(u1).

9. Let T, S : PF3(R) −→ PF4(R) be defined by

T (p) = q,where q(x) = p(x+ 3),

S(p) = q,where q(x) = xp(x).

(a) Show that T and S are linear transformations.

(b) Show that T + S is a linear transformation.

10. Let D, D2 : C∞(R) −→ C∞(R) be the first and second derivative operators, respectively.

(a) Show that D and D2 are linear transformations.

(b) Show that D2 +D is a linear transformation.



12. Show that the set Hom(R,R3) of linear transformations from R to R3, where addition, for T,G ∈ Hom(R,R3), is defined by

(T + G)(v) = T (v) + G(v),

and scalar multiplication, for T ∈ Hom(R,R3) and a ∈ R, is defined by

(aT )(v) = aT (v),

forms a vector space.

13. Let {F1, F2, F3} be transformations in Hom(R,R3), the set of all linear transformations from R to R3, as defined below:

F1(〈v〉) = 〈v, 0, 0〉
F2(〈v〉) = 〈0, v, 0〉
F3(〈v〉) = 〈0, 0, v〉.

(a) Show that {F1, F2, F3} spans Hom(R,R3).

(b) Show that {F1, F2, F3} is an independent subset of Hom(R,R3).

(c) Determine the dimension of Hom(R,R3).

14. Find a basis for the set G defined in Activity 5. What is the dimensionof this vector space?

15. Let T be the set of functions defined in Activity 6. Each Ti ∈ T , i ∈ {1, . . . , 24}, is defined by

Ti(〈v1, v2〉) = 〈av1 + bv2, cv1 + dv2〉 ,

where 〈v1, v2〉 ∈ (Z2)2, and a, b, c, d ∈ Z2. A review of this activity shows that the choices for a, b, c, and d are unique for each i. For example, T5 is defined by

T5(〈v1, v2〉) = 〈v1, v1 + v2〉 ,

where a = c = d = 1 and b = 0. Define L : T −→ (Z2)4 by

L(Ti) = 〈a, b, c, d〉 .


(a) Show that L defines a linear transformation.

(b) Determine whether L is one-to-one. See Section 5.2, if you do notremember the definition of one-to-one.

(c) Determine whether L is onto. See Section 5.2, if you do not re-member the definition of onto.

16. Let Hom(R2,R3) be the set of all linear transformations from R2 to R3.

(a) Show that Hom(R2,R3) forms a vector space under transformation addition and transformation scalar multiplication.

(b) Find a basis for Hom(R2,R3).

(c) Define a transformation between Hom(R2,R3) and the vector space R6. Is this transformation linear? one-to-one? onto?

17. Let F1 : R3 −→ R3 be defined by

F1(〈v1, v2, v3〉) = 〈3v2 − 2v3, v1 + v3,−4v1 + v2〉 .

Let F2 : R3 −→ R3 be defined by

F2(〈v1, v2, v3〉) = 〈v1 + v2 − v3, v1 − v2 + v3,−v1 + v2 + v3〉 .

(a) Verify that F1 and F2 are linear transformations.

(b) Find F2 ◦ F1, and verify that the composition is also a linear transformation.

18. Let F ∈ Hom(R3,R3) be defined by

F (〈v1, v2, v3〉) = 〈v1 + 2v2 + v3, 2v1 − v2 + 3v3,−v1 − 3v2 − 2v3〉 .

(a) Determine whether F is 1-1 and onto.

(b) Find the dimension of the kernel of F .

(c) If F is 1-1 and onto, find its inverse, and verify that the composition of F and its proposed inverse yields the identity transformation.

19. Let S : U −→ U and R : U −→ U be two invertible linear transformations. Show that (S ◦ R)−1 = R−1 ◦ S−1.


20. Let R1 : R2 −→ R2 be a rotation through π/4 radians, and let R2 : R2 −→ R2 be a rotation through π/2 radians.

(a) Show that the composition R2 ◦ R1 is a rotation, and determine the angle of rotation.

(b) Graph 〈1, 3〉, R1(〈1, 3〉), and (R2 ◦ R1)(〈1, 3〉) on the same set of axes. Is the graph consistent with what you have proven in (a)?

(c) Write a general proof: If R1 : R2 −→ R2 is a rotation through θ radians, and R2 : R2 −→ R2 is a rotation through φ radians, then R2 ◦ R1 is a rotation through θ + φ radians.

21. Let F1 : R2 −→ R2 be a reflection through the line y = √3 x, and let F2 : R2 −→ R2 be a reflection through the line y = −x.

(a) Show that the composition F2 ◦ F1 is a rotation, and find the angle of rotation.

(b) Graph 〈2, 1〉, F1(〈2, 1〉), and (F2 ◦ F1)(〈2, 1〉) on the same set of axes. Is the graph consistent with what you have proven in (a)?

(c) Write a general proof: If F1 : R2 −→ R2 is a reflection through the line y = m1x, F2 : R2 −→ R2 is a reflection through the line y = m2x, and m1 ≠ m2, then F2 ◦ F1 is a rotation through twice the angle between the two reflecting lines.

22. If T2 : R2 −→ R2 is a rotation about the origin, and T1 : R2 −→ R2 is a reflection with respect to a line through the origin, explain, without making any computations, why the composition T2 ◦ T1 cannot be a translation.

23. Write the proof of Theorem 5.3.4. Then, answer the following questions related to the operation of composition.

(a) Is composition associative? That is, given f, g, h ∈ Hom(U,U), does the following equality hold:

h ◦ (g ◦ f) = (h ◦ g) ◦ f?

If the answer is yes, provide a proof. If the equality does not hold, find a counterexample.

(b) Is composition commutative? That is, given f, g ∈ Hom(U,U), does the following equality hold:

g ◦ f = f ◦ g?

If the answer is yes, provide a proof. If the equality does not hold, find a counterexample.

(c) Show that the transformation I : U −→ U given by I(u) = u is an element of the set Hom(U,U). For every F ∈ Hom(U,U), show that F ◦ I = F = I ◦ F .

24. Let I be the subset of Hom(U,U) that consists of all invertible linear transformations from U to U .

(a) Is I closed under composition? If so, write a proof. If not, find two invertible transformations whose composition is not invertible.

(b) Is I closed under transformation addition? If so, write a proof. If not, find an example of two invertible transformations whose sum is not invertible.

(c) Is I closed under transformation scalar multiplication? If so, write a proof. If not, find an example of a transformation and a scalar whose resulting product is not invertible.

(d) Does I form a vector subspace of Hom(U,U) under transformation addition and transformation scalar multiplication? If so, write a proof. If not, provide an explanation.


Chapter 6

Systems, Transformations and Matrices

If you have stayed with us this far, you are in for a treat. We have been making some connections between the different topics that have been studied up until now. Things like how solution sets to certain systems of equations are subspaces of particular vector spaces and how they both can be pictured geometrically for small dimensions. But now we are ready to put it all in a neat and tidy package to be tied with a bow. The package in this case is matrices. You have already seen how matrices are connected to systems of equations. In the second section we connect matrices to linear transformations, and the package is complete. Wonder what we will do in Chapter 7?


6.1 Vector Spaces of Matrices

Activities

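1. Use the Matrix package in ISETL to define the matrices 2M and M + N, where

\[
M = \begin{pmatrix} 1 & 0 & 3 \\ 2 & 1 & 4 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 2 & 1 & 2 \\ 3 & 4 & 0 \end{pmatrix}
\]

and all arithmetic is done mod 5.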

2. Write a func scale_mat that accepts a scalar a and a matrix of scalars M and returns the matrix aM. Write a func add_mat which accepts two matrices M and N and returns the matrix M + N. Your funcs should assume that ms and as have been defined to implement scalar multiplication and scalar arithmetic.

Next define ms and as to implement arithmetic mod 5 and let M and N be as in Activity 1. Check your funcs using the following matrices.

(a) 2M

(b) 3M + 2N

(c) 3(M +N)

3. For matrices M and N given in Activity 1, use scale_mat and add_mat from Activity 2 to compute 2N and N + N. What is the relationship between these? Now compute 3M and M + M + M. What is the relationship between these? What axiom for vector spaces is being demonstrated here?

4. For the matrix M in Activity 1, determine a matrix Z such that M + Z = M. For the matrix Z you have just constructed, determine whether the equality P + Z = P holds for all 2 × 3 matrices P over $\mathbb{Z}_5$. Can you describe what such a matrix would look like for n × m matrices over K? What axiom for vector spaces is being demonstrated here?

5. For the matrix N given in Activity 1 and the matrix Z you found in Activity 4, determine a matrix −N such that N + (−N) = Z. Write a func neg_mat that accepts a matrix P and determines −P such that P + (−P) = Z. Your code should work for $\mathbb{R}$ as well as $\mathbb{Z}_p$. What axiom for vector spaces is being demonstrated here?

6. Use is_vector_space from Chapter 2 to determine if the collection of 4 × 3 matrices over $\mathbb{Z}_3$ forms a vector space over $\mathbb{Z}_3$ (using the appropriate arithmetic). We will denote this vector space as $(\mathbb{Z}_3)^{4\times 3}$.

7. Use is_vector_space to verify that $(\mathbb{Z}_5)^{2\times 2}$ is a vector space (using the appropriate arithmetic). Use is_subspace from Chapter 2 to determine if the following are subspaces of $(\mathbb{Z}_5)^{2\times 2}$:

(a) The set of all matrices in $(\mathbb{Z}_5)^{2\times 2}$ with the value 0 in the upper left corner ($a_{11} = 0$);

(b) The set of all matrices in $(\mathbb{Z}_5)^{2\times 2}$ with the value 1 in the upper left corner ($a_{11} = 1$);

(c) The set of all matrices in $(\mathbb{Z}_5)^{2\times 2}$ where the upper right corner is equal to the sum of the upper left and lower left corners ($a_{12} = a_{11} + a_{21}$);

(d) The set of all matrices in $(\mathbb{Z}_5)^{2\times 2}$ where the upper right corner is equal to the product of the upper left and lower left corners ($a_{12} = a_{11}a_{21}$).

8. Construct a linearly independent set of 3 vectors in $(\mathbb{Z}_3)^{4\times 3}$. Use LI from Activity 2 of Section 4.2 to verify that the set you have constructed is linearly independent.

9. Find a basis for $(\mathbb{Z}_3)^{4\times 3}$, and confirm that it is a basis by using is_basis from Activity 4 of Section 4.4. What is the dimension of $(\mathbb{Z}_3)^{4\times 3}$? Can you determine a general method for finding a basis for $K^{m\times n}$? Make a conjecture about the dimension of $K^{m\times n}$.

10. Write a func flatten_mat that accepts an n × m matrix whose (i, j) entry is $a_{ij}$ and that returns a vector whose $((i-1)m+j)$th entry is equal to $a_{ij}$. In other words, read the matrix left-to-right from top-to-bottom:

\[
\begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix} \longrightarrow [1, 2, 3, 4, 5, 6].
\]

What is the range of flatten_mat? What is the dimension of the range of flatten_mat?


11. Use is_linear from Chapter 5 to verify that flatten_mat is a linear transformation from $(\mathbb{Z}_3)^{4\times 3}$ to $(\mathbb{Z}_3)^{12}$. Remember, in order to use is_linear, you first need to run name_two_vector_spaces.

12. Verify that flatten_mat from $(\mathbb{Z}_3)^{4\times 3}$ to $(\mathbb{Z}_3)^{12}$ is one-to-one and onto.

13. Write a func trans_mat that accepts an n × m matrix whose (i, j) entry is $a_{ij}$ and returns a matrix whose (i, j) entry is $a_{ji}$. What is the range of trans_mat? What is the dimension of the range of trans_mat?

14. Assume that $(\mathbb{Z}_5)^{2\times 3}$ and $(\mathbb{Z}_5)^{3\times 2}$ are vector spaces. Use is_linear to verify that trans_mat from $(\mathbb{Z}_5)^{2\times 3}$ to $(\mathbb{Z}_5)^{3\times 2}$ is a linear transformation.

15. Verify that trans_mat from $(\mathbb{Z}_5)^{2\times 3}$ to $(\mathbb{Z}_5)^{3\times 2}$ is one-to-one and onto.

Discussion

Vector Spaces of Matrices

In Section 3.2, you were introduced to matrices as a computational tool for determining the solution set of a system of linear equations. Does your memory of their definition match the following?

Definition 6.1.1. If S is a set of numbers, an n × m matrix over S (read n by m matrix over S) is a function M : N × N −→ S. The dimension of an n × m matrix is “n by m”. The element at (i, j), or (i, j) entry of M is equal to M(i, j) and is denoted by $m_{ij}$.

A matrix will often be written as a rectangular array of n rows and m columns. The use of capital letters for the matrix and the corresponding lowercase letter for the elements is conventional.

The notation $M = (m_{ij})_{n\times m}$ is used to indicate a matrix of dimension n × m whose (i, j) entry is $m_{ij}$. For example, can you see why the matrix written as $(i + j)_{2\times 2}$ is equal to

\[
\begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}?
\]


Can you write $(ij)_{3\times 3}$, $(2i)_{2\times 3}$ and $(ij)_{3\times 1}$ as arrays of real numbers? Often the dimensions will be omitted from this notation if they are clear from the context.

You were introduced to vector spaces in Chapter 2 and explored the concept further in Chapters 4 and 5. We now examine matrices in this context.

What two operations must be defined on a set if it is to be a vector space? What properties must these operations satisfy? In Activities 1 and 2 you defined two matrix operations, scaling and adding. You tested whether these operations satisfied various vector space properties in Activities 3 through 5. In Activities 6 and 7, you used the func is_vector_space to prove that one particular collection of matrices, together with scaling and adding, constitutes a vector space. Is what you proved in Activity 6 true for any collection of n × m matrices over a set of scalars? The next theorem addresses this issue.

Theorem 6.1.1. The collection of matrices of dimension n × m over a set of scalars is a vector space under the operations:

\[
(m_{ij}) + (n_{ij}) = (m_{ij} + n_{ij});
\]

\[
k(m_{ij}) = (km_{ij}).
\]

Proof. We prove the distributive properties. A check of the remaining properties is left as an exercise (see Exercise 1).

Given scalars k and l and matrices $M = (m_{ij})$ and $N = (n_{ij})$, we must show that k(M + N) = kM + kN and that (k + l)M = kM + lM. Both of these results can be obtained by calculation:

\begin{align*}
k(M + N) &= k(m_{ij} + n_{ij}) = (k(m_{ij} + n_{ij})) = (km_{ij} + kn_{ij}) \\
&= (km_{ij}) + (kn_{ij}) = k(m_{ij}) + k(n_{ij}) = kM + kN, \\
(k + l)M &= (k + l)(m_{ij}) = ((k + l)m_{ij}) = (km_{ij} + lm_{ij}) \\
&= (km_{ij}) + (lm_{ij}) = k(m_{ij}) + l(m_{ij}) = kM + lM.
\end{align*}
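The two distributive laws can also be spot-checked numerically. The following is a minimal sketch in Python rather than ISETL; the function names mirror scale_mat and add_mat from the activities, but the explicit modulus argument p and the sample matrices are assumptions made only for this illustration. A check on one example is not a proof.

```python
def add_mat(M, N, p):
    """Entrywise sum of two equally sized matrices, reduced mod p."""
    return [[(a + b) % p for a, b in zip(row_m, row_n)]
            for row_m, row_n in zip(M, N)]


def scale_mat(k, M, p):
    """Scalar multiple kM, reduced mod p."""
    return [[(k * a) % p for a in row] for row in M]


p = 7                          # illustrative modulus
M = [[1, 5, 2], [0, 3, 6]]     # illustrative 2 x 3 matrices over Z_7
N = [[4, 4, 1], [2, 0, 5]]
k, l = 3, 5

# k(M + N) compared with kM + kN
left_1 = scale_mat(k, add_mat(M, N, p), p)
right_1 = add_mat(scale_mat(k, M, p), scale_mat(k, N, p), p)

# (k + l)M compared with kM + lM
left_2 = scale_mat((k + l) % p, M, p)
right_2 = add_mat(scale_mat(k, M, p), scale_mat(l, M, p), p)

print(left_1 == right_1, left_2 == right_2)
```

Each comparison follows the chain of equalities in the proof: the left side works on whole matrices, the right side distributes first and then adds.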

Throughout the proof of Theorem 6.1.1, parentheses were used in more than one context. In some cases, they were used as grouping symbols. In others, they were used as shorthand to denote matrices. Can you distinguish between these two uses? There was an additional ambiguity involving addition and multiplication. In some instances, multiplication refers to an operation between a scalar and a matrix. In other instances, it refers to an operation between two scalars. There is an analogous ambiguity in working with addition: the same notation indicates both matrix and scalar addition. Can you distinguish between these uses?

The following two questions will be explored in the exercises (see Exercises 2 and 3). If V is a vector space, do the n × m matrices over V form a vector space? If V is a vector space and S is a set, does the collection of all functions from S to V form a vector space? Can you see how each of these is a generalization of the previous theorem?

Subspaces of Matrices

Because n × m matrices over K form a vector space over K, there is a notion of a subspace of matrices. Do you recall the definition of a subspace and what is required to prove that a subset is a subspace? In the case of vector spaces, it is sufficient to prove that the subset is non-empty and that for every v, w in the subset and every scalar k, the vector v + kw is also in the subset (Theorem 2.3.2).

In Activity 7, you discovered that you could create subspaces by requiring that particular entries be 0. For example, the collection of all matrices whose (1, 1) entry is 0 is a subspace of the vector space $(\mathbb{Z}_5)^{2\times 2}$. The general result is the following:

Theorem 6.1.2. For any i and j, with 1 ≤ i ≤ n and 1 ≤ j ≤ m, the collection of all n × m matrices over K whose (i, j) entry is 0 is a subspace of $K^{n\times m}$.

Proof. First note that the zero matrix (0) satisfies the conditions and therefore the subset is non-empty. Next if both M and N have 0 as their (i, j) entry, then the (i, j) entry of M + kN will equal 0 + k0 = 0 and so M + kN is contained in the subset.
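For a small finite field the statement of Theorem 6.1.2 can be confirmed by exhaustive search. The sketch below is in Python rather than ISETL and is only an illustration; the names all_matrices, combo and is_subspace are hypothetical helpers written for this example (they are not the book's is_subspace func), and the choice of $\mathbb{Z}_3$ and 2 × 2 matrices is an assumption made to keep the search small.

```python
from itertools import product

P = 3  # a small prime keeps the exhaustive search tiny


def all_matrices(n, m):
    """Yield every n x m matrix over Z_P as a tuple of row tuples."""
    for entries in product(range(P), repeat=n * m):
        yield tuple(tuple(entries[r * m:(r + 1) * m]) for r in range(n))


def combo(M, k, N):
    """Return M + kN with entrywise arithmetic mod P."""
    return tuple(tuple((a + k * b) % P for a, b in zip(rm, rn))
                 for rm, rn in zip(M, N))


def is_subspace(subset):
    """Non-empty and closed under M + kN (the criterion of Theorem 2.3.2)."""
    members = set(subset)
    if not members:
        return False
    return all(combo(M, k, N) in members
               for M in members for N in members for k in range(P))


n, m = 2, 2
results = {}
for i in range(n):
    for j in range(m):
        # All matrices whose (i+1, j+1) entry (1-indexed) is 0.
        subset = [M for M in all_matrices(n, m) if M[i][j] == 0]
        results[(i + 1, j + 1)] = is_subspace(subset)

print(results)
```

The search applies the same criterion used in the proof, M + kN, to every pair of matrices and every scalar, once for each choice of the fixed position (i, j).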

You should be able to formulate and prove the more general theorem which states that if you have a fixed set of entries required to be 0 you still have a vector subspace of $K^{n\times m}$. Did any of the other subsets of matrices in Activity 7 form subspaces? For those that did, can you formulate a general statement like Theorem 6.1.2? This will be explored further in Exercise 4.


Summation Notation

Before continuing with the discussion on matrices, a few words about summation notation are probably appropriate. Do you recall how the notation

\[
\sum_{i=1}^{3} i^2
\]

is interpreted? Summation notation uses the capital Greek letter $\sum$ to indicate summation:

\[
\sum_{i=\ell}^{u} a_i = a_\ell + a_{\ell+1} + \cdots + a_{u-1} + a_u.
\]

The variable i is referred to as the variable of summation (or the index), the number $\ell$ is the lower bound and the number u is the upper bound. Often the bounds of the summation will be omitted if they are clear from the context. If we wish to perform a double-sum, where the bounds of each summation do not depend on the other index, we will abbreviate to a single summation sign:

\[
\sum_{i,j} a_{ij} = \sum_{i} \sum_{j} a_{ij} = \sum_{j} \sum_{i} a_{ij}.
\]

For example, if the context of the summation indicates that 1 ≤ i ≤ 3 and 1 ≤ j ≤ 2, then

\[
\sum_{i,j} ij = \sum_{i=1}^{3} \sum_{j=1}^{2} ij = \sum_{i=1}^{3} (i + 2i) = \sum_{i=1}^{3} 3i = 3(1) + 3(2) + 3(3) = 18.
\]

You should change the order of the indices and verify that the value of that summation is also 18; in other words, check that

\[
\sum_{j=1}^{2} \sum_{i=1}^{3} ij = 18.
\]

You must be careful when using these abbreviations. For example, can you explain why the following is incorrect:

\[
\sum_{i=1}^{5} \sum_{j=1}^{i} i \times j = \sum_{i,j} i \times j?
\]

When expanding a double-summation, you should expand the outer sum first if the index for the outer sum is used as a bound for the inner sum:

\[
\sum_{i=1}^{2} \sum_{j=1}^{i} (i + j) = \sum_{j=1}^{1} (1 + j) + \sum_{j=1}^{2} (2 + j) = (1 + 1) + (2 + 1) + (2 + 2) = 9.
\]
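The sums in this subsection are easy to reproduce with nested generator expressions. This is a small Python sketch (not ISETL) offered only as a check on the arithmetic; the variable names are illustrative, and the last line assumes that the abbreviated sum over i, j is read with both indices running from 1 to 5.

```python
# Rectangular bounds: 1 <= i <= 3 and 1 <= j <= 2, in both orders.
rect_i_first = sum(i * j for i in range(1, 4) for j in range(1, 3))
rect_j_first = sum(i * j for j in range(1, 3) for i in range(1, 4))

# Inner bound depends on the outer index: expand the outer sum first.
triangular = sum(i + j for i in range(1, 3) for j in range(1, i + 1))

# Dependent bounds versus the single-sign abbreviation (read here as
# 1 <= i <= 5 and 1 <= j <= 5, which is an assumption of this sketch).
dependent = sum(i * j for i in range(1, 6) for j in range(1, i + 1))
full_range = sum(i * j for i in range(1, 6) for j in range(1, 6))

print(rect_i_first, rect_j_first, triangular, dependent, full_range)
# prints: 18 18 9 140 225
```

The last two values differ because the abbreviated sum has lost the information that j stops at i.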

Dimensions of Matrix Vector Spaces

In Chapter 4, we discussed the dimension of a vector space. Activities 8 and 9 asked you to explore the concepts of linear independence and span in the context of vector spaces of matrices. In Activity 9, you were asked to give the dimension of $K^{n\times m}$ based on your work with $(\mathbb{Z}_3)^{4\times 3}$. Is your conjecture consistent with the statement of Theorem 6.1.3?

Theorem 6.1.3. The dimension of $K^{n\times m}$ is nm.

Proof. Define the matrix $E_{ij}$ to be the matrix whose (i, j) entry is 1 and whose other entries are 0. In functional notation:

\[
E_{ij}(k, \ell) =
\begin{cases}
1 & \text{if } (k, \ell) = (i, j) \\
0 & \text{otherwise.}
\end{cases}
\]

We will show that $\{E_{ij} \mid 1 \le i \le n,\ 1 \le j \le m\}$ is a generating set for $K^{n\times m}$. The proof that this set is independent is left as an exercise (see Exercise 6).

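Let $M = (m_{ij})$ be a matrix in $K^{n\times m}$. We will show that M can be written as a linear combination of the elements of the set $\{E_{ij}\}$:

\begin{align*}
\Bigl(\sum_{i,j} m_{ij} E_{ij}\Bigr)(k, \ell) &= \sum_{i,j} m_{ij} E_{ij}(k, \ell) \\
&= m_{k,\ell} E_{k,\ell}(k, \ell) + \sum_{(i,j) \neq (k,\ell)} m_{ij} E_{ij}(k, \ell) \\
&= m_{k,\ell} + \sum_{(i,j) \neq (k,\ell)} 0 = m_{k,\ell} = M(k, \ell).
\end{align*}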
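The generating-set argument can be made concrete by building the matrices $E_{ij}$ and summing them with the entries of M as coefficients. The following Python sketch (not ISETL) does this for one sample matrix; unit_matrix and linear_combination are hypothetical names chosen for this illustration, and ordinary integer arithmetic is assumed in place of a general field K.

```python
def unit_matrix(i, j, n, m):
    """E_ij as an n x m list of rows: 1 at (i, j) (1-indexed), 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(1, m + 1)]
            for r in range(1, n + 1)]


def linear_combination(M):
    """Form the sum over i, j of m_ij * E_ij, one term at a time."""
    n, m = len(M), len(M[0])
    total = [[0] * m for _ in range(n)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E = unit_matrix(i, j, n, m)
            coeff = M[i - 1][j - 1]          # the scalar m_ij
            for r in range(n):
                for c in range(m):
                    total[r][c] += coeff * E[r][c]
    return total


M = [[2, -1, 0], [5, 3, 7]]                  # illustrative 2 x 3 matrix
rebuilt = linear_combination(M)
print(rebuilt == M)
```

In each position (k, ℓ) only the term with (i, j) = (k, ℓ) contributes, which is exactly the step in the displayed calculation where the remaining sum collapses to 0.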

Can you reformulate the proof representing matrices as arrays of numbers? Which proof is more understandable? Which proof is more detailed?


Linear Transformations of Matrices

Because $K^{n\times m}$ is a vector space over K, it is natural to try to learn about linear transformations whose domain and/or range is a vector space of matrices. The first such transformation you worked with flattened a matrix into a tuple (in Activities 10 through 12). This is referred to as “flattening the matrix.” Be sure to keep the two vector spaces $K^{n\times m}$ and $K^{nm}$ clear: the first is a collection of matrices, and the second is a collection of tuples.

Theorem 6.1.4. The flattening map is a linear transformation from $K^{n\times m}$ to $K^{nm}$.

Proof. Let $M = (m_{ij})$ and $N = (n_{ij})$ be elements of $K^{n\times m}$, let a, b be scalars, and let F denote the flattening map. We must show that F(aM + bN) = aF(M) + bF(N). We start with the left hand side of the equality: $aM + bN = (am_{ij} + bn_{ij})$ and so the $((i-1)m+j)$th component of F(aM + bN) will equal $am_{ij} + bn_{ij}$.

On the right hand side, the $((i-1)m+j)$th component of F(M) will equal $m_{ij}$ and the $((i-1)m+j)$th component of F(N) will equal $n_{ij}$. Therefore, the $((i-1)m+j)$th component of aF(M) + bF(N) will equal $am_{ij} + bn_{ij}$. This proves that F(aM + bN) = aF(M) + bF(N) as needed.

Theorem 6.1.5. The flattening map is one-to-one and onto.

Proof. The inverse of the flattening map can be defined as follows. Start with the nm-tuple $[a_k]$. For each value of k, with 1 ≤ k ≤ nm, there are unique integers $i_k$ and $j_k$ such that $1 \le i_k \le n$, $1 \le j_k \le m$, and $k = (i_k - 1)m + j_k$. The inverse of the flattening map sends $[a_k]$ to the matrix whose $(i_k, j_k)$ entry is equal to $a_k$. Since the flattening map has an inverse, it is one-to-one and onto.
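The index formula k = (i − 1)m + j and its inverse can be traced in a few lines. This is a Python sketch rather than ISETL; flatten_mat borrows its name from the activities, unflatten is a hypothetical name for the inverse, and matrices are assumed to be stored as lists of rows. The 1-indexed quantities from the proof are kept explicit so they can be compared with the text.

```python
def flatten_mat(M):
    """Send the (i, j) entry of an n x m matrix to position (i-1)m + j."""
    n, m = len(M), len(M[0])
    flat = [None] * (n * m)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            k = (i - 1) * m + j               # 1-indexed position in the tuple
            flat[k - 1] = M[i - 1][j - 1]
    return flat


def unflatten(flat, n, m):
    """Inverse map: recover (i_k, j_k) from k and place a_k there."""
    M = [[None] * m for _ in range(n)]
    for k in range(1, n * m + 1):
        q, r = divmod(k - 1, m)               # k - 1 = q*m + r with 0 <= r < m
        i_k, j_k = q + 1, r + 1               # so k = (i_k - 1)*m + j_k
        M[i_k - 1][j_k - 1] = flat[k - 1]
    return M


M = [[1, 2, 3], [4, 5, 6]]
flat = flatten_mat(M)
print(flat)                       # [1, 2, 3, 4, 5, 6]
print(unflatten(flat, 2, 3) == M)
```

The division with remainder in unflatten is what guarantees that the integers $i_k$ and $j_k$ exist and are unique.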

Theorems 6.1.4 and 6.1.5 describe a special relationship between two vector spaces that was defined in Chapter 2. Two vector spaces are isomorphic if there is a one-to-one, onto linear transformation from one space to the other. We will explore this concept a little more in the exercises. What is important is that if two vector spaces are isomorphic, their linear structures correspond (any true statement about one of them can be translated into a true statement about the other).

The second linear map we examined (in Activities 13 through 15) is the transpose map. It has some of the same features as the flattening map.

Theorem 6.1.6. The transpose map $M \mapsto M^t$ from $K^{n\times m}$ to $K^{m\times n}$ is an isomorphism of vector spaces.


Exercises


2. If V is a vector space over K, how would you define $V^{n\times m}$? How would you multiply an element of $V^{n\times m}$ by an element of K? How would you add two elements of $V^{n\times m}$? Prove that $V^{n\times m}$ is a vector space. Explain how Theorem 6.1.1 would be a special case of this result (assuming the result were true).

3. Let S be a set, and V be a vector space over K. Let $V^S$ denote the set of all functions from S to V. How would you multiply an element of $V^S$ by an element of K? How would you add two elements of $V^S$? Prove that $V^S$ is a vector space. Explain how Theorem 6.1.1 is a special case of this result (assuming the result were true).

4. Let T be a linear transformation from $K^{n\times m}$ to K. Prove that the collection of $M \in K^{n\times m}$ which map to 0 under T is a subspace of $K^{n\times m}$.

For example, the subset in Activity 7(c) arises in this way from the transformation

\[
T\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = a_{12} - a_{11} - a_{21}.
\]

5. Compute the following summations:

(a) $\sum_{i=1}^{3} \sum_{j=1}^{i+1} i + j$

(b) $\sum_{i \neq j} ij$ with 1 ≤ i ≤ 3, 1 ≤ j ≤ 3

(c) $\sum_{i=1}^{2} \sum_{j=i}^{2} 2^i 3^j$

6. Prove that the set of matrices $\{E_{ij}\}$ defined in the proof of Theorem 6.1.3 is independent.


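7. For each of the following matrices, compute their images under the flattening map.

(a) $\begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 3 \end{pmatrix}$

(b) $\begin{pmatrix} 2 & 3 \\ 1 & 2 \\ 0 & 1 \\ 2 & 4 \end{pmatrix}$

(c) $\begin{pmatrix} 1 & 3 & 2 & 3 & 4 & 2 \\ 3 & 2 & 4 & 2 & 4 & 1 \end{pmatrix}$

8. For each of the following tuples, compute their images under the inverse of the flattening map $F : (\mathbb{Z}_5)^{2\times 3} \to (\mathbb{Z}_5)^6$.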

(a) [1, 4, 3, 4, 0, 2]

(b) [0, 3, 4, 2, 1, 0]

(c) [1, 0, 0, 0, 1, 0]

(d) [0, 3, 2, 4, 1, 3]

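9. For each of the following matrices, compute their images under the transpose map.

(a) $\begin{pmatrix} 1 & 3 & 1 \\ 2 & 3 & 4 \end{pmatrix}$

(b) $\begin{pmatrix} 2 & 3 & 2 & 1 & 4 & 0 \\ 3 & 4 & 2 & 0 & 2 & 1 \end{pmatrix}$

(c) $\begin{pmatrix} 1 & 2 \\ 2 & 2 \\ 2 & 4 \\ 1 & 0 \\ 1 & 0 \end{pmatrix}$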


11. Define the trace of an n × n matrix by:

\[
\mathrm{Tr} : (m_{ij}) \longmapsto \sum_{i=1}^{n} m_{ii}.
\]

Determine whether the trace is a linear map from the vector space of n × n matrices over K to the 1-dimensional vector space $(K)^1$.

12. For a 2 × 2 matrix over K, define the map:

\[
\det : (m_{ij}) \longmapsto m_{11}m_{22} - m_{12}m_{21}.
\]

Determine whether det is a linear map from the vector space of all 2 × 2 matrices over K to the 1-dimensional vector space $(K)^1$.


6.2 Transformations and Matrices

Activities

1. For the matrices M and N, given by

\[
M = \begin{pmatrix} 1 & 3 & 4 & 1 & 0 \\ 2 & 1 & 4 & 0 & 2 \\ 1 & 3 & 0 & 2 & 1 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 2 & 3 & 1 & 3 & 1 \\ 1 & 3 & 2 & 1 & 1 \\ 0 & 0 & 1 & 0 & 0 \end{pmatrix}
\]

do the following:

(a) Use matrix_row from the Matrix package to obtain the rows of M as tuples. Determine the dimension of the subspace generated by these tuples in $(\mathbb{Z}_5)^5$.

(b) Use matrix_col from the Matrix package to obtain the columns of M as tuples. Determine the dimension of the subspace generated by these tuples in $(\mathbb{Z}_5)^3$.

(c) What relationship do you find between the dimensions of these spaces?

2. Write a func called row_rank that accepts an n × m matrix M over K, converts the rows of M into tuples, and returns the dimension of the subspace generated by these tuples in $K^m$. The code in row_rank can assume that name_vector_space has been run for the vector space $K^m$. Let $K = \mathbb{Z}_5$, and let M be given by

\[
M = \begin{pmatrix}
1 & 3 & 4 & 2 & 1 \\
2 & 1 & 3 & 2 & 3 \\
1 & 3 & 4 & 2 & 1 \\
1 & 4 & 2 & 2 & 1 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

Determine the value of row_rank on the following matrices:

(a) M ;

(b) M with the third and fourth rows interchanged;

(c) M with the second row multiplied by 2;


(d) M with the fourth row replaced by the sum of the first and fourth rows;

(e) the matrix obtained by reducing M to echelon form (see Chapter 3);

(f) the matrix obtained by reducing M to reduced echelon form (see Chapter 3).

Make a conjecture about the effects of elementary row operations (as defined in Chapter 3) on the value of row_rank.

3. Write a func called col_rank that accepts an n × m matrix M over K, converts the columns of M into tuples, and returns the dimension of the subspace generated by these tuples in $K^n$. The code in col_rank can assume that name_vector_space has been run for the vector space $K^n$. Let $K = \mathbb{Z}_3$, and let M be given by

\[
M = \begin{pmatrix}
1 & 0 & 1 & 2 & 1 \\
2 & 1 & 0 & 2 & 0 \\
1 & 0 & 1 & 2 & 1 \\
1 & 1 & 2 & 2 & 1 \\
0 & 0 & 0 & 1 & 0
\end{pmatrix}.
\]

Determine the value of col_rank on the following matrices:

(a) M ;

(b) M with the third and fourth rows interchanged;

(c) M with the second row multiplied by 2;

(d) M with the fourth row replaced by the sum of the first and fourth rows;

(e) the matrix obtained by reducing M to echelon form;

(f) the matrix obtained by reducing M to reduced echelon form.

Make a conjecture about the effects of elementary row operations (as defined in Chapter 3) on the value of col_rank.

4. For each system of linear equations over $\mathbb{Z}_3$, determine the column rank of the matrix of coefficients and the number of free variables in the solution to the system.

(a)
\[
\begin{aligned}
x + y &= 2 \\
x + 2y &= 1 \\
2x + y &= 0
\end{aligned}
\]

(b)
\[
\begin{aligned}
2y + z &= 2 \\
x + y + 2z &= 1
\end{aligned}
\]

(c)
\[
\begin{aligned}
y + z &= 0 \\
x + 2y + 2z &= 2 \\
x &= 2
\end{aligned}
\]

(d)
\[
\begin{aligned}
x + y + z &= 2 \\
2x + 2y + 2z &= 1
\end{aligned}
\]

(e)
\[
\begin{aligned}
x + y + z + w &= 1 \\
x + 2y + 2w &= 2 \\
x + z + w &= 0
\end{aligned}
\]

Given a system of equations, can you determine a relationship between the column rank and the number of free variables?

5. Define a func called mat_apply that accepts an n × m matrix $M = (m_{ij})$ and an m-tuple $[x_j]$ and returns an n-tuple whose ith coordinate is equal to

\[
\sum_{j=1}^{m} m_{ij} x_j.
\]

Your code may assume that ms and as have been defined to implement scalar arithmetic.

For this activity, define ms and as to implement arithmetic mod 3. For each matrix M given below, determine the values of mat_apply on the tuples $e_1$, $e_2$, and $e_3$. Can you determine how the results relate to the entries in the given matrix?

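(a) $M = \begin{pmatrix} 2 & 3 & 1 \\ 1 & 3 & 1 \end{pmatrix}$

(b) $M = \begin{pmatrix} 1 & 2 & 0 \\ 0 & 3 & 1 \\ 1 & 0 & 1 \end{pmatrix}$

(c) $M = \begin{pmatrix} 2 & 3 & 1 \end{pmatrix}$

(d) $M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$

6. Write a func called coord that accepts a vector v and an ordered basis $B = [\mathbf{b}_1, \ldots, \mathbf{b}_n]$ and returns the coordinates of v with respect to B. That is, the return value of coord should be the tuple of scalars $[a_1, \ldots, a_n]$ such that

\[
v = \sum_{i=1}^{n} a_i \mathbf{b}_i.
\]

Your code may assume that name_vector_space has been used to establish the vector space.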

Run name_vector_space to establish the vector space $(\mathbb{Z}_5)^3$. Let $B = [\langle 1, 0, 0\rangle, \langle 1, 1, 0\rangle, \langle 1, 1, 1\rangle]$ and $C = [\langle 2, 0, 0\rangle, \langle 0, 2, 0\rangle, \langle 0, 0, 2\rangle]$. Determine whether the following are true or false:

(a) The function coord (using the ordered basis B) is a linear transformation from $(\mathbb{Z}_5)^3$ to $(\mathbb{Z}_5)^3$.

(b) If the ordered basis B is used in both function calls, then coord and LC (from Activity 2 of Section 4.1) are inverse functions.

(c) The function coord using the ordered basis B and the function LC using the ordered basis C are inverse functions.

(d) The function obtained by first applying coord using the ordered basis B and then applying LC using ordered basis C is a linear transformation.

7. Write a func called matrify that accepts a function T : U −→ V, a basis $B = [\mathbf{b}_j]$ of U, and a basis $C = [\mathbf{c}_i]$ of V. The func should return a matrix $M = (m_{ij})$ such that the entries in the jth column are the coordinates of $T(\mathbf{b}_j)$ with respect to the basis C.

Now run name_two_vector_spaces to set the domain to $(\mathbb{Z}_5)^2$ and the range to $(\mathbb{Z}_5)^3$. Use the coordinate bases for the remainder of this activity (see Section 4.4 if you cannot remember the definition of coordinate bases). For each function T below, compare the value of $T(e_1)$ to the value obtained by applying M to the vector $e_1$ and the value of $T(e_2)$ to the value obtained by applying M to the vector $e_2$.

(a) $T(\langle x, y\rangle) = \langle 2x + y, x, x + y\rangle$

(b) $T(\langle x, y\rangle) = \langle x, 0, 2x - y\rangle$

(c) $T(\langle x, y\rangle) = \langle 2x - y, x + y, xy\rangle$

What is the relationship between the vectors in the first two cases? Make a conjecture about the property (or lack thereof) that makes the third case behave differently.

8. This activity uses the scalar field $\mathbb{Z}_5$. For each pair of ordered bases given below, compare the result of matrify on the linear transformation given by $T(\langle x, y, z\rangle) = \langle x + 2y, y + 2z\rangle$.

(a) B = [e1, e2, e3], C = [e1, e2]

(b) B = [e2, e1, e3], C = [e1, e2]

(c) B = [e1, 2e2, e3], C = [e1, e2]

(d) B = [e1, 2e2, e3], C = [e1, 3e2]

How does changing the order of the domain basis affect the matrix? How does changing the order of the range basis affect the matrix? How does scaling an element of the domain basis affect the matrix? How does scaling an element of the range basis affect the matrix?

9. Let $T : (\mathbb{Z}_5)^2 \to (\mathbb{Z}_5)^3$ be defined as $T(\langle x, y\rangle) = \langle x + 2y, x + 3y, y\rangle$ and use the coordinate bases throughout this activity. How do the results of matrify(T) and matrify(2T) relate to each other? Can you make a general conjecture about how matrify is affected when you scale its input?

Now let $S : (\mathbb{Z}_5)^2 \to (\mathbb{Z}_5)^3$ be defined as $S(\langle x, y\rangle) = \langle x, y, 2x + 3y\rangle$. How do the matrices matrify(T), matrify(S), and matrify(T + S) relate to each other? Can you make a conjecture about how matrify applied to a sum relates to matrify applied individually to each of the terms of a sum?

10. Use name_two_vector_spaces to set the domain and range to be $(\mathbb{Z}_5)^2$. For each of the following matrices M, compare the value of col_rank(M) with the rank of the linear transformation which applies M to a vector.

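(a) $M = \begin{pmatrix} 1 & 2 \\ 3 & 2 \end{pmatrix}$

(b) $M = \begin{pmatrix} 1 & 2 \\ 3 & 1 \end{pmatrix}$

(c) $M = \begin{pmatrix} 0 & 2 \\ 1 & 2 \end{pmatrix}$

(d) $M = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$

Make a conjecture about the relationship between the two ranks.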

Discussion

The Rank of a Matrix

In Activity 1, you examined two ways to convert an n × m matrix into a set of tuples (row-by-row or column-by-column). The dimensions of the subspaces spanned by these tuples provide a significant amount of information about a matrix. We now provide a name for these numbers.

Definition 6.2.1. Let M be an n × m matrix over K. The dimension of the subspace of $K^m$ generated by the rows of M is called the row rank of M. The dimension of the subspace of $K^n$ generated by the columns of M is called the column rank of M.

In Activity 1, you also discovered that although the values of n and m may be different, the row rank and the column rank of the matrix turned out to be identical. We now work toward proving this result by looking at the effect of the elementary row operations on the row and column ranks of a matrix. Can you recall the three elementary row operations?

In Activity 2, you examined the effect of the elementary row operations on the row rank of a matrix. Your work should have led you to make a conjecture consistent with the following theorem.


Theorem 6.2.1. The row rank of a matrix is unaffected by the elementary row operations.

Proof. Theorem 4.3.1 proved that two sets generate the same subspace, provided every vector in each set can be written as a linear combination of vectors in the other set. We will use this to show that the elementary row operations do not affect the subspace generated by the rows of a matrix.

The first elementary row operation is interchanging two rows. In this case, the set of tuples obtained from the rows of the original matrix is identical to that of the transformed matrix; hence, the spans are the same.

The second elementary row operation is to multiply one row by a nonzero scalar. Let i be the row which is multiplied, v be the tuple from row i of the original matrix, and k be the nonzero scalar. Then the sets are identical except for the tuple from the ith row. The original matrix will have the tuple v and the new matrix will have the tuple kv. However kv is clearly a linear combination of the tuples from the original matrix, and $v = \frac{1}{k}(kv)$ is a linear combination of the tuples from the new matrix.

The third elementary row operation replaces a row i by the sum of row i with another row j. Let i be the row which is being replaced, $v_i$ be the tuple created from row i in the original matrix, j be the row which is added to row i, and $v_j$ be the tuple from row j of the original matrix. As in the last case, the set of tuples generated from the new matrix differs by only one tuple from the set of tuples generated from the original matrix. The tuple from row i of the new matrix, $v_i + v_j$, is a linear combination of the tuples from the original matrix. Similarly, the tuple $v_i = (v_i + v_j) - v_j$ can be written as a linear combination of the tuples from the new matrix.

Since the tuples produced by applying elementary row operations can be written as linear combinations of the tuples from the original matrix and vice-versa, the subspaces generated by the rows of the transformed matrix and the rows of the original matrix are the same. Therefore, the row rank is unaffected.

In Chapter 3, we transformed matrices into reduced echelon form as a means of determining the solution set of a system of equations. Here we will use reduced echelon form again, because it provides an easy way to determine the row (and column) rank of a matrix. Do you remember the requirements for a matrix to be in reduced echelon form?

Since a matrix can be transformed into reduced echelon form by using elementary row operations exclusively, the row rank of a matrix is equal to the row rank of its corresponding reduced echelon form. As stated in Definition 6.2.1, the row rank is the dimension of the vector space generated by the rows of a matrix. However, the nonzero rows of a reduced echelon matrix are linearly independent (why?). If we put these ideas together, what is the relationship between the number of nonzero rows of a reduced echelon matrix and its rank? Given any matrix, how can we find a basis for the vector space generated by its rows?
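One way to experiment with these ideas is to reduce a matrix over $\mathbb{Z}_p$ and count the nonzero rows. The sketch below is written in Python, not ISETL, and is not the book's row_rank func; reduced_echelon and row_rank are illustrative implementations that assume p is prime (so that every nonzero scalar has an inverse, computed here with Fermat's little theorem), and the sample matrix over $\mathbb{Z}_7$ is invented for the example.

```python
def reduced_echelon(M, p):
    """Reduce M to reduced echelon form over Z_p (p assumed prime)."""
    A = [[x % p for x in row] for row in M]
    n, m = len(A), len(A[0])
    pivot_row = 0
    for col in range(m):
        # Look for a row at or below pivot_row with a nonzero entry here.
        src = next((r for r in range(pivot_row, n) if A[r][col] != 0), None)
        if src is None:
            continue
        A[pivot_row], A[src] = A[src], A[pivot_row]        # interchange rows
        inv = pow(A[pivot_row][col], p - 2, p)             # inverse mod p
        A[pivot_row] = [(inv * x) % p for x in A[pivot_row]]  # scale the row
        for r in range(n):
            if r != pivot_row and A[r][col] != 0:
                f = A[r][col]
                # Subtract a multiple of the pivot row from row r.
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == n:
            break
    return A


def row_rank(M, p):
    """Number of nonzero rows in the reduced echelon form of M."""
    return sum(1 for row in reduced_echelon(M, p) if any(row))


p = 7
M = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]                      # row 2 = 2 * row 1
swapped = [M[2], M[1], M[0]]
scaled = [M[0], [(3 * x) % p for x in M[1]], M[2]]
added = [M[0], M[1], [(a + b) % p for a, b in zip(M[2], M[0])]]

print(reduced_echelon(M, p))
print([row_rank(X, p) for X in (M, swapped, scaled, added)])
```

The three variants apply one elementary row operation each, so by Theorem 6.2.1 all four ranks agree.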

Although you may have been able to predict the outcome of Activity 2, the results of Activity 3 may have come as a surprise. These findings demonstrate the following theorem, which is considered one of the deep results in linear algebra.

Theorem 6.2.2. The column rank of a matrix is unaffected by the elementary row operations.

Although this can be proven directly, an elegant proof (which avoids many of the calculations) will be presented in Section 6.3. Your conjecture from Activity 1 follows directly from Theorem 6.2.2.

Theorem 6.2.3. The column rank and the row rank of a matrix M are equal.

Proof. We can assume that M is in reduced echelon form (why?). Denote the row rank of M by r. Because M is in reduced echelon form, M will have exactly r nonzero rows. We consider the tuples generated by the columns of M which have leading entries.

Each of the tuples coming from these columns will contain a single nonzero entry and this entry will equal 1. This means that these tuples are $\{e_1, \ldots, e_r\}$.

To conclude the proof, every column of M will have zero in all but its top r positions, so the set $\{e_1, \ldots, e_r\}$ will generate the subspace generated by all of the column tuples. This is sufficient to prove that the column rank of M is r.
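Over a finite field, the equality of the two ranks can be checked without any row reduction, because a subspace of dimension d over $\mathbb{Z}_p$ has exactly $p^d$ elements. The following Python sketch (not ISETL) enumerates every linear combination of the rows and of the columns of a sample matrix; span_dimension is a hypothetical helper written for this illustration, and the matrix over $\mathbb{Z}_3$ is invented for the example.

```python
from itertools import product


def span_dimension(vectors, p):
    """Dimension of the span over Z_p, found by listing every combination."""
    length = len(vectors[0])
    span = set()
    for coeffs in product(range(p), repeat=len(vectors)):
        combo = tuple(sum(c * v[t] for c, v in zip(coeffs, vectors)) % p
                      for t in range(length))
        span.add(combo)
    # A d-dimensional subspace over Z_p has exactly p**d elements.
    dim, size = 0, 1
    while size < len(span):
        size *= p
        dim += 1
    return dim


p = 3
M = [[1, 2, 0, 1],
     [2, 1, 0, 2],      # equals 2 * (row 1) mod 3
     [0, 1, 1, 1]]

rows = M
cols = [list(c) for c in zip(*M)]   # columns of M as tuples in (Z_3)^3

print(span_dimension(rows, p), span_dimension(cols, p))
```

The rows live in $(\mathbb{Z}_3)^4$ and the columns in $(\mathbb{Z}_3)^3$, yet the two spans have the same dimension, as Theorem 6.2.3 predicts.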

In the context of systems of equations, the rank of the coefficient matrix can be related to the number of determined variables in the solution set. This relationship was presented in Theorem 3.2.1 and again in Activity 4. The following theorem restates the result.

Theorem 6.2.4. The number of determined variables in the solution set of a system of linear equations is equal to the rank of the coefficient matrix of that system.

Proof. Given a system of equations with coefficient matrix M, we augment M and then transform the augmented matrix to reduced echelon form. Every leading entry then corresponds to a determined variable in the solution of the system. From the proof of Theorem 6.2.3, we can see that the number of leading entries is equal to the rank of the coefficient matrix.

The Matrix of a Linear Transformation

In Chapter 3, you used matrices to solve systems of equations. In Chapter 5, you described systems of equations in terms of linear transformations and found that every matrix can be used to define a linear transformation. We complete the triangle in this section by showing that every linear transformation can be described in terms of a matrix. In Activity 5, you wrote the expression for a linear transformation between two vector spaces of tuples as a matrix application.

In order to use this technique on every vector space, you need to represent a vector space as a collection of tuples. This was the purpose of the funcs in Activity 6. The functions implemented in coord and LC provide an isomorphism between an n-dimensional vector space V over K and the vector space of tuples $K^n$. In some ways, this is precisely the reason ordered bases were defined.

In Activity 7, you probably realized that matrify provided a generic method for representing a linear transformation as a matrix application. This method of producing a matrix from a linear transformation is so important, it gets its own definition.

Definition 6.2.2. Let V be a vector space of dimension n, and let $B = [\mathbf{b}_i]$ be an ordered basis. Then for each vector v, the coordinate vector (or coordinates) of v with respect to B is defined to be the vector $\langle x_1, \ldots, x_n\rangle$ in $K^n$ such that

\[
v = \sum_{i=1}^{n} x_i \mathbf{b}_i.
\]

Given vector spaces U and V with ordered bases $B = [\mathbf{b}_j]$ and C, respectively, and a linear transformation T : U −→ V, the matrix representation of T with respect to B and C is the matrix whose jth column is the coordinate vector of $T(\mathbf{b}_j)$ with respect to the ordered basis C.

Although the dimensions of the resulting matrix are not explicitly mentioned, they can be determined based on the dimensions of U and V.


The choice of ordered bases is significant, as was seen in Activity 8. This dependence can create ambiguities which make computations very difficult and will be explored more fully in Chapter 7. We will try to alleviate this with the following notation. If a tuple-vector is a coordinate vector, then we will subscript it with the name of the ordered basis. For example, consider the vector $v = \langle 1, 2\rangle$ in the vector space $(\mathbb{Z}_5)^2$ with ordered bases $B = [e_1, e_2]$ and $C = [\langle 1, 1\rangle, \langle 0, 1\rangle]$. Then the coordinates of v with respect to B will be $\langle 1, 2\rangle_B$. This explains why the basis B is referred to as the coordinate basis. On the other hand, the coordinates of v with respect to C are given by $\langle 1, 1\rangle_C$. In cases where the ordered bases are clear from the context, we will often omit mention of them. One particular case is a vector space of tuples, where we will assume the use of the coordinate bases if no other bases are mentioned.
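The two coordinate vectors in this example can be reproduced by a direct search. The sketch below is in Python rather than ISETL; lc and coord are stand-ins named after the LC and coord funcs of the activities, but their signatures (an explicit modulus p and a brute-force search over all coefficient tuples) are assumptions made for this illustration only.

```python
from itertools import product


def lc(coeffs, basis, p):
    """Linear combination of the basis vectors with the given coefficients."""
    length = len(basis[0])
    return tuple(sum(a * b[t] for a, b in zip(coeffs, basis)) % p
                 for t in range(length))


def coord(v, basis, p):
    """Coordinates of v with respect to an ordered basis, by exhaustive search."""
    target = tuple(x % p for x in v)
    for coeffs in product(range(p), repeat=len(basis)):
        if lc(coeffs, basis, p) == target:
            return list(coeffs)
    return None   # only reached if v is not in the span of the given vectors


p = 5
v = (1, 2)
B = [(1, 0), (0, 1)]      # the coordinate basis [e1, e2]
C = [(1, 1), (0, 1)]

print(coord(v, B, p))     # [1, 2]
print(coord(v, C, p))     # [1, 1]
```

The second result says that $1\langle 1, 1\rangle + 1\langle 0, 1\rangle = \langle 1, 2\rangle$, which is the statement $v = \langle 1, 1\rangle_C$ from the text.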

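Because there is so much information involved with matrix representations, the following diagram is often used to illustrate the situation.

\[
\begin{array}{ccc}
U & \xrightarrow{\;T\;} & V \\
{\scriptstyle B}\,\Big\updownarrow & & \Big\updownarrow\,{\scriptstyle C} \\
K^m & \xrightarrow{\;M\;} & K^n
\end{array}
\]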

In this diagram, the vertical arrows indicate the isomorphisms from U to $K^m$ and from V to $K^n$, which you implemented as coord (with inverse LC). These vertical arrows are labeled by the ordered bases used for the isomorphism. The arrow labeled T indicates the linear transformation from U to V, and the arrow labeled M indicates the application of the matrix representation of T with respect to B and C.

The top row of the diagram presents the linear transformation in terms of the vector spaces, while the bottom row of the diagram presents the matrix representation. The vertical arrows represent the choice of coordinates for the vector space and tie together the two presentations.

This diagram also illustrates an important equality: if we take a vector in U, follow the arrow labeled B (downward), and then apply the matrix M, we get the same result as if we had first applied T, followed by finding the coordinate representation in terms of the basis C. This was suggested in Activity 7 and is stated in the following theorem.

Theorem 6.2.5. Let U and V be vector spaces with ordered bases B and C, respectively, and let $T : U \to V$ be a linear transformation. Given u in U, the coordinate vector of T(u) with respect to C is equal to the result of applying the matrix representation of T to the coordinate vector of u with respect to B.

Proof. Let $B = [b_j]$ and $C = [c_i]$ be the ordered bases. Let $[u_j]$ be the coordinates of u with respect to B, and let $M = (m_{ij})$ be the matrix representation of T with respect to B and C. Then

\[
T(u) = T\Big(\sum_j u_j b_j\Big) = \sum_j u_j T(b_j) = \sum_j u_j \Big(\sum_i m_{ij} c_i\Big) = \sum_{i,j} u_j m_{ij} c_i = \sum_i \Big(\sum_j m_{ij} u_j\Big) c_i.
\]

Therefore, the coordinate vector of T(u) with respect to C is the tuple whose ith component is $\sum_j m_{ij} u_j$. This tuple is the same as that obtained by applying the matrix M to the tuple $[u_j]$.
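Theorem 6.2.5 can also be checked numerically on a small example. The sketch below is written in Python rather than ISETL. The names `coord`, `matrify` and `apply_matrix` echo the funcs from the activities, but their signatures here are assumptions, and the transformation T and the bases B and C are chosen only for illustration.

```python
from itertools import product

P = 5  # work over Z_5

def lin_comb(xs, basis):
    """Linear combination sum_i xs[i] * basis[i], computed mod P."""
    n = len(basis[0])
    return tuple(sum(x * b[i] for x, b in zip(xs, basis)) % P for i in range(n))

def coord(v, basis):
    """Brute-force coordinates of v with respect to an ordered basis."""
    target = tuple(c % P for c in v)
    for xs in product(range(P), repeat=len(basis)):
        if lin_comb(xs, basis) == target:
            return xs
    raise ValueError("vector is not in the span of the given basis")

def matrify(T, B, C):
    """Matrix whose jth column is the coordinate vector of T(b_j) w.r.t. C."""
    cols = [coord(T(b), C) for b in B]
    return [[col[i] for col in cols] for i in range(len(C))]

def apply_matrix(M, x):
    """Apply the matrix M (a list of rows) to the tuple x, mod P."""
    return tuple(sum(m * xj for m, xj in zip(row, x)) % P for row in M)

def T(v):
    # An illustrative linear map from (Z_5)^2 to (Z_5)^3.
    x, y = v
    return ((2 * x + y) % P, (x - 3 * y) % P, (3 * x) % P)

B = [(1, 1), (0, 1)]                      # ordered basis of (Z_5)^2
C = [(1, 0, 0), (0, 2, 0), (1, 0, 1)]     # ordered basis of (Z_5)^3
M = matrify(T, B, C)

# Compare both routes around the diagram for every vector u in (Z_5)^2.
all_agree = all(coord(T(u), C) == apply_matrix(M, coord(u, B))
                for u in product(range(P), repeat=2))
print(all_agree)
```

Going "down then across" (`apply_matrix(M, coord(u, B))`) and "across then down" (`coord(T(u), C)`) are compared for all 25 vectors of the domain, which is feasible only because the field is small.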

Given an $n \times m$ matrix M over K, there is a linear transformation $T : K^m \to K^n$ obtained by matrix application. A natural question now arises: is the matrix associated with T equal to the original matrix M? The answer is contained in the following theorem.

Theorem 6.2.6. Let M be an $n \times m$ matrix over K and $T : K^m \to K^n$ the linear transformation mapping v to Mv. Then the matrix of T with respect to the coordinate bases is equal to M.


Properties of Matrix Representations

In Chapter 5, we defined a vector space structure on the set of linear transformations from U to V denoted by Hom(U, V). In Section 6.1, we defined a vector space structure on the set of $n \times m$ matrices denoted by $K^{n \times m}$. Theorem 6.2.5 provides a bridge between linear transformations and matrices. A natural question is how the vector space structure on Hom(U, V) relates to that on $K^{n \times m}$. You explored this relationship in Activity 9. Was the conjecture you formulated in that activity consistent with the following theorem?

Theorem 6.2.7. Let U and V be n-dimensional vector spaces over K. Let B be an ordered basis for U, and let C be an ordered basis for V. Let T and S be elements of Hom(U, V), and let $k \in K$. Assume all matrix representations are given with respect to the bases B and C. Then, we can conclude:

• the matrix representation of T + S is equal to the sum of the matrix representation of T and the matrix representation of S;

• the matrix representation of kT is equal to the product of k and the matrix representation of T.

Proof. We will prove the result for scalar multiplication, and leave the result for addition as an exercise (see Exercise 6). Let $T : U \to V$ be a linear transformation, and let k be any scalar in K. To simplify the notation, let $B = [b_j]$ and $C = [c_i]$ represent the ordered bases of U and V, respectively, and let $M = (m_{ij})$ denote the matrix representation of T with respect to B and C. We now determine the matrix representation of kT with respect to B and C.

For all j, we have:

\[ (kT)(b_j) = k(T(b_j)) = k\Big(\sum_i m_{ij} c_i\Big) = \sum_i (k m_{ij}) c_i. \]

Therefore the jth column of the matrix representation of kT is equal to the product of k and the jth column of the matrix representation of T. This completes the proof.

Another place where there have been parallel developments between linear transformations and matrices has been the value known as rank. Do you recall the definition for the rank of a transformation and the rank of a matrix? As you discovered in Activity 10, this value is also independent of the matrix representations.

Theorem 6.2.8. The rank of a linear transformation is equal to the column rank of its matrix representation (with respect to any ordered bases).

Proof. Let $T : U \to V$ be a linear transformation, and $B = [b_j]$ and C be ordered bases of U and V, respectively. Assume that the dimension of V is equal to n.

The range of T is spanned by the vectors $T(b_j)$. The tuple of coordinates of the vector $T(b_j)$ with respect to the ordered basis C is equal to the jth column of the matrix. As a result, the range of T is isomorphic to the subspace of $K^n$ spanned by the column tuples; hence they will have the same dimension.

Theorem 6.2.8 ties the concept of column rank to the concept of the rank of a linear transformation. This result will be instrumental in proving Theorem 6.2.2 using the following strategy. In Section 6.3, we will determine the linear transformation analog of an elementary row operation. It will be proven that these operations do not affect the rank of the linear transformation and hence will not affect the column rank of the associated matrix. This is the missing piece in the proof of Theorem 6.2.2.

Retrospection

We started with the goal of solving systems of linear equations (each of which can be written as a single matrix equation). In this task, we developed an abstract theory of vector spaces, bases, and linear transformations. We have now returned full circle, because every vector space can be represented as a vector space of tuples, and every linear transformation can be represented as a matrix application. A reasonable question now is “why bother, if everything really is just matrices, to do all of the abstraction?”

By making the dependence on the ordered bases explicit, we have gained the freedom to change our choice of ordered bases. In many cases this can help simplify a problem at hand (as will be done in Chapter 7). Another advantage is that many of the proofs are actually easier to write if we abstract away from the details. Perhaps the greatest gain has been in places where there are no natural bases. By providing abstract proofs of the results, we learn that many of the techniques which were effective in the concrete case of tuples can be applied (without change) to the more abstract cases.

Exercises

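1. Consider each of the matrices below as being over $Z_5$. Compute their row rank. Next, consider each of the matrices below as being over R. Compute their row rank. Do these numbers differ?

(a)
\[ M = \begin{pmatrix} 1 & 2 & 3 & 4 & 1 \\ 2 & 3 & 1 & 3 & 2 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

(d)
\[ M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 4 & 0 \end{pmatrix} \]

(e)
\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 1 & 4 & 0 \end{pmatrix} \]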

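2. For each of the following matrices over R, compute its column rank.

(a)
\[ M = \begin{pmatrix} 1 & 2 & 3 & 4 & 1 \\ 2 & 3 & 1 & 3 & 2 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]

(d)
\[ M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 0 & 3 \\ 1 & 4 & 0 \end{pmatrix} \]

(e)
\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 0 \\ 1 & 4 & 0 \end{pmatrix} \]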

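3. For each system of equations over R, determine the number of determined variables. You do not need to solve the equations.

(a)
\[
\begin{aligned}
x + 2y + 3z &= 4 \\
x + 3y - 2z &= 2 \\
3x + 8y - z &= 8
\end{aligned}
\]

(b)
\[
\begin{aligned}
x + y - z &= 2 \\
x + 2y - 3z &= 4 \\
2x + 7y - 4z &= 2
\end{aligned}
\]

(c)
\[
\begin{aligned}
x + y &= 2 \\
x + 3y &= 4 \\
x - 2y &= 7
\end{aligned}
\]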

4. For each vector space, ordered basis B, and vector v below, write the coordinates of v with respect to B.

(a) $V = (Z_5)^2$, $B = [\langle 1, 1 \rangle, \langle 0, 1 \rangle]$, and $v = \langle 1, 3 \rangle$

(b) $V = R^3$, $B = [\langle 1, 2, 3 \rangle, \langle 2, 4, 1 \rangle, \langle 3, 6, 5 \rangle]$, and $v = \langle 0, 4, 1 \rangle$

(c) $V = R^{2 \times 3}$, $B = [E_{11}, E_{12}, E_{13}, E_{21}, E_{22}, E_{23}]$, and

\[ v = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 0 \end{pmatrix}. \]

The definition of the $E_{ij}$ can be found in the proof of Theorem 6.1.3.

(d) $V = PF_2(R)$, the vector space of polynomial functions with degree two or less over R, $B = [x \to 1, x \to x, x \to x^2]$, and $v = p$, where $p(x) = x^2 + 3x - 2$.

(e) $V = PF_2(R)$, the vector space of polynomial functions with degree two or less over R, $B = [x \to 1, x \to x + 1, x \to x^2 + x + 1]$, and $v = p$, where $p(x) = x^2 - 2x + 3$.

(f) $V = PF(Z_3)$, the vector space of all polynomial functions over $Z_3$, $B = [x \to 1, x \to x, x \to x^2]$, and $v = p$, where $p(x) = x^3$. If you consider this impossible, carefully consider the information about polynomials in Chapter 1.

(g) $V = PF(Z_5)$, the vector space of all polynomial functions over $Z_5$, $B = [x \to 1, x \to x, x \to x^2, x \to x^3, x \to x^4]$, and $v = p$, where

\[ p(x) = 2x^7 + 3x^6 + 2x^5 + 3x^4 + x^3 + 2x^2 + x + 4. \]

(h) $V = C^\infty(R)$, the set of all infinitely differentiable functions over R, B has as its first three elements $x \to 1$, $x \to x$ and $x \to x^2$, and $v = f$, where $f(x) = 2(x+1)^2 + 3x - 1$. You are only required to provide a clear description of the coordinates.

(i) $V = C^\infty(R)$, the set of all infinitely differentiable functions over R, B has as its first two elements sin and cos, and $v = f$, where f is the function $f(x) = \sin(x + \pi)$. As in the previous item, you are only required to provide a description of the coordinates.

5. For each of the linear transformations $T : U \to V$ and ordered bases B and C below, write the matrix of T with respect to B and C.

(a) $T : (Z_5)^2 \to (Z_5)^3$ defined by

\[ T(\langle x, y \rangle) = \langle 2x + y, x - 3y, 3x \rangle, \]

and each basis is the appropriate coordinate basis.

(b) $T : (Z_5)^3 \to (Z_5)^3$ defined by

\[ T(\langle x, y, z \rangle) = \langle x, y, z \rangle, \]

B the coordinate basis, and $C = [\langle 1, 0, 0 \rangle, \langle 0, 2, 0 \rangle, \langle 1, 0, 1 \rangle]$.


(c) $T : R^2 \to R^3$ defined by

\[ T(\langle x, y \rangle) = \langle x + y, 2x - y, x + 3y \rangle, \]

B the coordinate basis, and $C = [\langle 1, 0, 1 \rangle, \langle 0, 2, 3 \rangle, \langle 1, 2, 3 \rangle]$.

(d) $T : PF_2(R) \to PF_2(R)$ defined by $T(p) = q$, where $q(x) = p(x+1)$, and $B = C = [x \to 1, x \to x, x \to x^2]$.

(e) $T : PF_3(R) \to PF_2(R)$ defined by $T(p) = p'$, the derivative of p, $B = [x \to 1, x \to x, x \to x^2, x \to x^3]$ and $C = [x \to 1, x \to x, x \to x^2]$.

(f) Let V be the vector subspace of $C^\infty(R)$ spanned by $\{\sin, \cos, x \to x, x \to 1\}$, and let $T : V \to V$ be defined by $T(f) = f'$, the derivative of f, with $B = C = [\sin, \cos, x \to x, x \to 1]$.



8. For the linear transformations $T, S : R^3 \to R^3$ defined by

\[
\begin{aligned}
T(\langle x, y, z \rangle) &= \langle x + y, y + z, z + x \rangle \\
S(\langle x, y, z \rangle) &= \langle 2x, x + 3y, x + 2y + z \rangle,
\end{aligned}
\]

find the matrix representations for the linear transformations below with respect to the coordinate bases.

(a) 3T

(b) T + S

(c) 2T + 3S

(d) 3T + 2S

(e) 3(T + 2S)

9. For each linear transformation below, compute its rank.

(a) T : (Z5)3 −→ (Z5)2 given by

T (〈x, y, z〉) = 〈2x+ 3y, 2z + x〉 .


(b) $T : (Z_5)^2 \to (Z_5)^2$ given by

\[ T(\langle x, y \rangle) = \langle x + y, x - y \rangle. \]

(c) T : R3 −→ R3 given by

T (〈x, y, z〉) = 〈x+ y − z, x+ z, 2y + z〉 .

(d) T : PF4(Z5) −→ PF4(Z5) given by T (p) = q, where q(x) = p(x+1).

(e) T : C∞(R) −→ C∞(R) given by T (f) = f ′, the derivative of f .


6.3 Matrix Multiplication

Activities

1. Define the product of M and N by the rule that the jth column of the product is equal to the result of applying the matrix M to the jth column of N. This product is denoted by MN.

For each pair of matrices M and N below, compute MN and NM. Based on these results, make a conjecture about the relationship between MN and NM.

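(a)
\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \\ 1 & 3 & 0 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 2 & 1 \\ 1 & 2 \\ 1 & 1 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 2 & 1 \\ 3 & 1 & 1 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 1 & 2 & 4 & 1 \\ 2 & 3 & 2 & 1 \\ 2 & 1 & 0 & 2 \end{pmatrix} \]

(d)
\[ M = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 1 & 1 \\ 2 & 3 \end{pmatrix} \]

2. Define a func called mat_mul which accepts two matrices and returns their product. Your code may assume that as and ms have been defined.

Define as and ms to implement arithmetic mod 5. For each M, N and P given below, compute the values of (MN)P and M(NP). Based on these results, make a conjecture about the relationship between (MN)P and M(NP).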


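(a)
\[ M = \begin{pmatrix} 1 & 2 \\ 2 & 3 \end{pmatrix}, \quad N = \begin{pmatrix} 2 & 4 \\ 2 & 2 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 3 \\ 0 & 1 & 0 \end{pmatrix}, \quad N = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 0 & 4 \\ 4 & 4 & 4 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 2 & 3 & 1 \\ 2 & 3 & 4 \\ 1 & 0 & 1 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 1 & 3 \\ 2 & 3 \\ 1 & 3 \end{pmatrix}, \quad N = \begin{pmatrix} 2 & 1 \\ 3 & 1 \end{pmatrix} \quad \text{and} \quad P = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \]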

3. Define a func called left_mul that accepts a matrix M and returns a func. The return value of left_mul(M) should implement the map:

\[ N \to MN. \]

Now define as and ms to implement arithmetic mod 3, and let M be

\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 1 \end{pmatrix}. \]

What are the domain and range of left_mul(M)? Is this a linear transformation between these spaces?

4. Let I denote the matrix in $(Z_5)^{2 \times 2}$ defined by

\[ I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]

For each matrix M in $(Z_5)^{2 \times 2}$, what are the values of MI and IM? Why do you think that the matrix whose (i, j) entry is equal to 1, if i = j, and equal to 0 otherwise, is called the identity matrix?


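5. Use name_two_vector_spaces to define a domain of $(Z_5)^2$ and a range of $(Z_5)^3$. For each pair of matrices M and N below, compute the result of applying N followed by applying M to $\langle 1, 3 \rangle$. Then compute the result of applying MN to the vector $\langle 1, 3 \rangle$. Based on these results, make a conjecture about the connection between the composition of matrix applications and the product of matrices.

(a)
\[ M = \begin{pmatrix} 1 & 3 & 4 & 1 \\ 2 & 3 & 1 & 0 \\ 1 & 0 & 0 & 1 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 2 & 3 \\ 1 & 1 \\ 3 & 0 \\ 0 & 1 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 0 \\ 2 & 3 & 4 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 2 & 1 \\ 3 & 4 \\ 0 & 0 \end{pmatrix} \]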

6. Use name_two_vector_spaces to set the domain to $(Z_3)^2$ and the range to $(Z_3)^3$. For each pair of linear transformations T, S compute the product of the matrix representations of T and S and the matrix representation of the composition TS. (Use the coordinate bases for all representations.) Based on these results, make a conjecture about the connection between the composition of linear transformations and the product of their matrix representations.

(a)
\[
\begin{aligned}
T(\langle x, y \rangle) &= \langle 2x + y, x - y, x + y \rangle \\
S(\langle x, y, z \rangle) &= \langle x + y, x - y, x + z \rangle
\end{aligned}
\]

(b)
\[
\begin{aligned}
T(\langle x, y \rangle) &= \langle x + y, x - y, 2x, 3y \rangle \\
S(\langle x, y, z, w \rangle) &= \langle x, y, z \rangle
\end{aligned}
\]

7. For each matrix M over $Z_5$, create the augmented matrix (M|I), where I is the identity matrix of the appropriate dimension (see Activity 4). Use Gaussian Elimination to reduce this augmented matrix to reduced echelon form, which we denote by (E|N), and then compute MN. What can you say about this product if E is the identity matrix? What can you say about this product if E is not the identity matrix?

(a)
\[ M = \begin{pmatrix} 2 & 1 \\ 7 & 8 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 1 \\ 0 & 1 & 0 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 0 & 1 \\ 0 & 2 \end{pmatrix} \]

(d)
\[ M = \begin{pmatrix} 1 & 4 & 2 \\ 2 & 3 & 0 \\ 1 & 2 & 0 \end{pmatrix} \]

8. Use the technique in Activity 7 to write a func called mat_inv that accepts a matrix M and returns a matrix N such that MN is the identity matrix. If no such matrix exists, mat_inv should return OM. For each matrix M over $Z_5$ given below, compute the column rank of M and then determine if M has an inverse. Make a conjecture about the relationship between these two results.

(a)
\[ M = \begin{pmatrix} 1 & 3 & 2 \\ 2 & 1 & 4 \\ 3 & 4 & 1 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 2 & 1 & 3 \\ 1 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 2 & 1 \\ 2 & 0 \end{pmatrix} \]

(d)
\[ M = \begin{pmatrix} 1 & 4 \\ 2 & 3 \end{pmatrix} \]


9. For each system of equations below, find the solution set using Gaussian Elimination. Then find the inverse of the coefficient matrix and apply this matrix to the tuple of constants. Based on your findings, describe a method of solving a system of equations using matrix inverses.

(a)
\[
\begin{aligned}
x + y - z &= 3 \\
x - y + z &= 0 \\
x + 2y + z &= 1
\end{aligned}
\]

(b)
\[
\begin{aligned}
x + 2y - 3z &= 4 \\
-3y + 2z &= 2 \\
x - 4z &= 2
\end{aligned}
\]

(c)
\[
\begin{aligned}
x + y &= 0 \\
2x - 2y &= 0
\end{aligned}
\]

10. For each linear transformation from $(Z_5)^3$ to $(Z_5)^3$, determine if it is invertible. Then determine if the image of the coordinate basis under T is also a basis.

(a) $T(\langle x, y, z \rangle) = \langle x + y, x + 4y, z \rangle$

(b) $T(\langle x, y, z \rangle) = \langle x + 2y + 3z, 2x + 3y, 3x + y + 4z \rangle$

(c) $T(\langle x, y, z \rangle) = \langle x + y, y + 4z, z + 4x \rangle$

(d) $T(\langle x, y, z \rangle) = \langle x + 3y + z, x + 2y, y + z \rangle$

Discussion

Matrix Multiplication

You might recall that a vector space does not require a multiplication operation between the elements of the vector space, only an addition operation between them, and multiplication between an element and a scalar. In Section 6.1, we focused on the vector space structure of the collection of $n \times m$ matrices over the scalars K and discussed such an addition and scalar multiplication. In this section we will develop an additional structure on matrices: the ability to multiply two matrices together. We now present a formal definition of matrix multiplication as developed in Activities 1 and 2.

Definition 6.3.1. Given an $n \times p$ matrix $M = (m_{ij})$ and a $p \times m$ matrix $N = (n_{ij})$, we define their product to be the $n \times m$ matrix

\[ MN = \Big(\sum_{k=1}^{p} m_{ik} n_{kj}\Big). \]
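The entry formula of Definition 6.3.1 and the column-by-column rule of Activity 1 describe the same product. The following Python sketch (not the ISETL func mat_mul from Activity 2) implements both and compares them on a small integer example. The function names and the sample matrices are illustrative assumptions.

```python
def mat_mul(M, N):
    """Product via the entry formula: (MN)_ij = sum_k m_ik * n_kj."""
    p = len(N)
    if any(len(row) != p for row in M):
        raise ValueError("columns of M must match rows of N")
    return [[sum(M[i][k] * N[k][j] for k in range(p))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def mat_apply(M, x):
    """Apply the matrix M (list of rows) to the tuple x."""
    return [sum(m * xk for m, xk in zip(row, x)) for row in M]

def mat_mul_by_columns(M, N):
    """Product via Activity 1: column j of MN is M applied to column j of N."""
    cols = [mat_apply(M, [row[j] for row in N]) for j in range(len(N[0]))]
    return [[col[i] for col in cols] for i in range(len(M))]

M = [[1, 2],
     [0, 1]]
N = [[1, 0],
     [3, 1]]

MN = mat_mul(M, N)
NM = mat_mul(N, M)
print(MN, NM)
```

For this pair both products are defined, and the two results differ.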

One important question to ask about matrix multiplication is how it interacts with the vector space operations. In Activity 3, you verified that matrix multiplication by a given matrix is a linear transformation. This result is true in general.

Theorem 6.3.1. Let M be a $p \times q$ matrix over K. Then:

• the map $N \to MN$ defines a linear transformation from $K^{q \times m} \to K^{p \times m}$;

• the map $N \to NM$ defines a linear transformation from $K^{n \times p} \to K^{n \times q}$.

Proof. We prove the case for the first map. You will be asked to prove the second case in the exercises (see Exercise 2). Let $M \in K^{p \times q}$. For $N \in K^{q \times m}$, define $T(N) = MN \in K^{p \times m}$. We must show that for any $a, b \in K$ and $N, P \in K^{q \times m}$ we have

\[ M(aN + bP) = T(aN + bP) = aT(N) + bT(P) = aMN + bMP. \]

This can be done by direct computation:

\[
\begin{aligned}
M(aN + bP) = (m_{ij})(a n_{ij} + b p_{ij}) &= \Big(\sum_k m_{ik}(a n_{kj} + b p_{kj})\Big) \\
&= \Big(\sum_k a m_{ik} n_{kj} + b m_{ik} p_{kj}\Big) \\
&= a\Big(\sum_k m_{ik} n_{kj}\Big) + b\Big(\sum_k m_{ik} p_{kj}\Big) = aMN + bMP.
\end{aligned}
\]


In Activities 1, 2 and 4, you explored some of the basic properties of matrix multiplication. Although you undoubtedly discovered that matrix multiplication shared some properties with real number multiplication, you found that there are notable differences. Can you identify some of the similarities? Can you describe the differences? Some of the similarities are presented in the following theorem.

Theorem 6.3.2. Let the matrix I be defined by $(\delta_{ij})$, where $\delta_{ij}$ is equal to 1, if i = j, and $\delta_{ij} = 0$, otherwise. Let the matrix Z be the zero matrix. If A, B, and C are matrices, then the following equalities hold, provided the products are defined (in other words, provided the dimensions are correct).

\[
\begin{aligned}
A(B + C) &= AB + AC \\
(A + B)C &= AC + BC \\
k(AB) &= (kA)B = A(kB) \\
AI &= A = IA \\
AZ &= Z = ZA \\
A(BC) &= (AB)C
\end{aligned}
\]

Proof. We will prove some of these and leave others as exercises.

• A(B + C) = AB + AC is valid, because multiplication from the left by A is a linear transformation.

• (A + B)C = AC + BC is valid, because multiplication from the right by C is a linear transformation.

• A(kB) = k(AB) is valid, because multiplication from the left by A is a linear transformation.

• (kA)B = k(AB) is valid, because multiplication from the right by B is a linear transformation.

• A(BC) = (AB)C is left to the reader (see Exercise 3).

• We will prove AI = A = IA directly.

\[
AI = (a_{ij})(\delta_{ij}) = \Big(\sum_k a_{ik}\delta_{kj}\Big) = \Big(a_{ij}\delta_{jj} + \sum_{k \neq j} a_{ik}\delta_{kj}\Big) = \Big(a_{ij} + \sum_{k \neq j} 0\Big) = (a_{ij}) = A.
\]

The proof of A = IA is similar and left to the reader (see Exercise 4).

• AZ = Z = ZA is left to the reader (see Exercise 5).

The statement in the theorem, “provided the products are defined”, is required, because not all matrices can be multiplied. What are the requirements on the dimensions of two matrices that allow them to be multiplied?

The second property which matrix multiplication lacks is commutativity. Even when both AB and BA are defined, they may not even have the same dimensions, as you discovered in Activity 1. This also explains why Theorem 6.3.2 requires two different distribution laws (one from the right and one from the left).

Multiplication as Composition

There is another operation which has similar properties to matrix multiplication: composition of linear transformations. Composition is associative, distributes over addition and scalar multiplication, and has an identity and a zero function. Function composition is not always defined and is usually non-commutative.

In Activities 5 and 6, you explored the connection between function composition and matrix multiplication. Although the proofs of the following two theorems are straightforward, they provide an important connection between matrices and linear transformations, which we will be able to use for a variety of purposes.


Theorem 6.3.3. Let M be an $n \times p$ matrix over K and N a $p \times m$ matrix over K. If we start with a tuple x in $K^m$, apply N to this tuple and then apply M to the resulting tuple in $K^p$, then the final tuple in $K^n$ will be equal to the tuple obtained by applying MN to the original tuple in $K^m$. In short: M(Nx) = (MN)x.

Proof. Let $x = \langle x_1, \dots, x_m \rangle$ be any vector in $K^m$.

The result of applying the matrix N to x will have $\sum_j n_{kj} x_j$ as its kth component. We then apply the matrix M to this and the result will have

\[ \sum_k m_{ik} \Big(\sum_j n_{kj} x_j\Big) \]

as its ith component.

The result of applying the matrix MN to the vector x will have

\[ \sum_j \Big(\sum_k m_{ik} n_{kj}\Big) x_j \]

as its ith component. What remains to be shown is that these two are equal:

\[ \sum_k m_{ik} \Big(\sum_j n_{kj} x_j\Big) = \sum_{k,j} m_{ik} n_{kj} x_j = \sum_j \Big(\sum_k m_{ik} n_{kj}\Big) x_j. \]
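A quick numerical check of Theorem 6.3.3 over $Z_5$ is sketched below in Python (not ISETL). The matrices M and N, the tuple x, and the helper names are chosen here for illustration and are not taken from the activities.

```python
P = 5  # arithmetic mod 5

def mat_apply(M, x):
    """Apply the matrix M (list of rows) to the tuple x, mod P."""
    return tuple(sum(m * xj for m, xj in zip(row, x)) % P for row in M)

def mat_mul(M, N):
    """Matrix product (sum_k m_ik * n_kj), with entries reduced mod P."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N))) % P
             for j in range(len(N[0]))]
            for i in range(len(M))]

M = [[1, 2, 3],
     [0, 1, 4]]          # 2 x 3, so n = 2 and p = 3
N = [[1, 0],
     [2, 1],
     [3, 3]]             # 3 x 2, so m = 2
x = (1, 2)               # a tuple in (Z_5)^2

two_steps = mat_apply(M, mat_apply(N, x))   # M(Nx)
one_step = mat_apply(mat_mul(M, N), x)      # (MN)x
print(two_steps, one_step)
```

Both routes give the same tuple in $(Z_5)^2$, as the theorem predicts.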

An analogous relationship also holds for matrix representations of linear transformations, as the following theorem shows:

Theorem 6.3.4. Let $T : K^p \to K^n$ and $S : K^m \to K^p$ be linear transformations. The matrix representation of $T \circ S$ is equal to the product of the matrix representation of T and the matrix representation of S.

Proof. Let the matrix representation of T be given by $(t_{ij})$ and the matrix representation of S be given by $(s_{ij})$. We find the column tuples of the matrix representation of $T \circ S$ by applying $T \circ S$ to the basis elements of $K^m$.

\[
\begin{aligned}
T \circ S(e_j) = T\Big(\sum_k s_{kj} e_k\Big) &= \sum_k s_{kj} T(e_k) = \sum_k s_{kj} \Big(\sum_i t_{ik} e_i\Big) \\
&= \sum_{i,k} t_{ik} s_{kj} e_i = \sum_i \Big(\sum_k t_{ik} s_{kj}\Big) e_i.
\end{aligned}
\]

This means that the matrix representation of $T \circ S$ is the matrix $\big(\sum_k t_{ik} s_{kj}\big)$, which completes the proof.
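Theorem 6.3.4 can be illustrated with two concrete maps. In the Python sketch below (not ISETL), the maps S and T, the helper `matrix_of`, and the use of integer arithmetic are all assumptions made for the example; the matrix of a map with respect to the coordinate bases is built column by column from the images of the coordinate basis vectors.

```python
def std_basis(n):
    """Coordinate basis e_1, ..., e_n of K^n as tuples."""
    return [tuple(1 if i == j else 0 for i in range(n)) for j in range(n)]

def matrix_of(f, m):
    """Matrix of f : K^m -> K^n w.r.t. coordinate bases (column j is f(e_j))."""
    cols = [f(e) for e in std_basis(m)]
    return [[col[i] for col in cols] for i in range(len(cols[0]))]

def mat_mul(M, N):
    """Matrix product via the entry formula sum_k m_ik * n_kj."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

def S(v):
    # Illustrative map from K^2 to K^3.
    x, y = v
    return (x + y, x - y, 2 * x)

def T(v):
    # Illustrative map from K^3 to K^2.
    x, y, z = v
    return (x + 2 * y, z)

M = matrix_of(T, 3)                          # 2 x 3
N = matrix_of(S, 2)                          # 3 x 2
composite = matrix_of(lambda v: T(S(v)), 2)  # matrix of T o S
product = mat_mul(M, N)
print(composite, product)
```

Here $T(S(\langle x, y \rangle)) = \langle 3x - y, 2x \rangle$, and the matrix read off from this formula agrees with the product MN.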

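The result of these two theorems is that (with the extra baggage of ordered bases) matrix multiplication and transformation composition are just two views of the same process. This can be described using the following diagram.

\[
\begin{array}{ccccc}
U & \xrightarrow{\;S\;} & V & \xrightarrow{\;T\;} & W \\
\Big\updownarrow & & \Big\updownarrow & & \Big\updownarrow \\
K^m & \xrightarrow{\;N\;} & K^p & \xrightarrow{\;M\;} & K^n
\end{array}
\]

In addition, a composite arrow labeled $T \circ S$ runs from U to W along the top, and a composite arrow labeled MN runs from $K^m$ to $K^n$ along the bottom.

The top row of the diagram illustrates the composition of linear transformations between vector spaces and the bottom row illustrates matrix multiplication. As always in these diagrams, the link between the top and bottom rows is the choice of ordered bases.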

Invertible Matrices and Change of Bases

In Activity 4, you discovered the identity matrix I. What is special about a matrix of this form? In Activities 7 and 8, you discovered a method for determining the inverse of a matrix (in other words, a method for finding a matrix N such that MN = I). The method suggested in this activity is equivalent to solving the equation MN = I, where I is the identity matrix, and N represents the inverse of M, if the inverse exists. If we denote column j of N by $N_j$, then the system of equations given by $MN_j = I_j$ is one of n systems represented by MN = I, which, when expanded, yields n systems of equations in n unknowns, all of which have the same coefficient matrix. Since the coefficients are the same for each system, we can apply elementary row operations to the augmented matrix (M|I) to find the solution set, if it exists, of each individual system. The func mat_inv that you wrote in Activity 8 implements this strategy for determining the inverse.
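The strategy just described can be sketched as follows. This is a Python illustration under stated assumptions, not the ISETL func from Activity 8: it works over $Z_p$ for a prime p, returns `None` where the ISETL version would return OM, and relies on Python 3.8+ for the modular inverse `pow(a, -1, p)`. The sample matrices are chosen here for illustration.

```python
def mat_inv(M, p):
    """Invert a square matrix over Z_p (p prime) by reducing (M | I).

    Returns the inverse as a list of rows, or None if M is not invertible.
    """
    n = len(M)
    # Build the augmented matrix (M | I) with entries reduced mod p.
    A = [[x % p for x in row] + [1 if i == j else 0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return None  # the reduced form of M cannot be the identity
        A[col], A[pivot] = A[pivot], A[col]        # interchange two rows
        inv = pow(A[col][col], -1, p)              # scale the pivot row to get a 1
        A[col] = [(inv * x) % p for x in A[col]]
        for r in range(n):                         # clear the rest of the column
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]                  # right half is the inverse

inv_a = mat_inv([[1, 2], [3, 4]], 5)
inv_b = mat_inv([[1, 2], [2, 4]], 5)
print(inv_a, inv_b)
```

The second matrix has a second row equal to twice the first, so the reduction produces a zero row on the left and no inverse exists.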

Theorem 6.3.5. The matrix M is invertible if and only if the matrix obtained by reducing M to reduced echelon form is the identity matrix.

A direct consequence of this was investigated in Activity 8 and is presented here as a corollary.

Corollary. An $n \times n$ matrix is invertible if and only if its row rank is equal to n.

Proof. This follows quickly from the previous theorem. The row rank of M is equal to the row rank of the matrix obtained by reducing M to reduced echelon form. If M is invertible, the reduced matrix is I, and the row rank is n. If M is not invertible, the reduced matrix, which is not I, has a row of all zeros and hence, row rank less than n.

You should see if you can translate the analysis of the paragraph preceding the theorem into statements about linear transformations. Can you use this technique to describe a general method for finding inverses of linear transformations?

Theorem 6.3.6. If T is an invertible linear transformation from $K^n$ to $K^n$, then the matrix representation of T in $K^{n \times n}$ is invertible.

Proof. Let T have inverse S so that $T \circ S = I$. Let M be the matrix representation of T and N be the matrix representation of S. Then MN is the matrix representation of $T \circ S = I$ which means that MN = I.

In this proof we have used I with two meanings. In one place it refers to the identity linear transformation, and in another it refers to the identity matrix. Can you identify which one is which in the proof above?


Theorem 6.3.7. Let M be an invertible matrix in $K^{n \times m}$. The linear transformation $T : K^m \to K^n$ defined by

\[ T(x) = Mx, \]

where $x \in K^m$, is invertible.

Corollary. An $n \times n$ matrix is invertible if and only if its column rank is equal to n.

Proof. Let M be any matrix in $K^{n \times n}$ and T be the linear transformation defined by applying M. Then the range of T is generated by the columns of M. This means that the rank of T is equal to the column rank of M. The corollary follows because T is invertible if and only if its rank is equal to the dimension of its domain (which is n).

One particular application of inverse matrices was presented in Activity 9, where you used a matrix inverse to find the solution set of a system of linear equations. The general technique can be described as follows. Starting with an equation MX = Y, you find the value of X by computing $M^{-1}Y$, where $M^{-1}$ is the inverse of M. Written out in notation, the solution should look very familiar.

\[
\begin{aligned}
MX &= Y \\
M^{-1}(MX) &= M^{-1}Y \\
(M^{-1}M)X &= M^{-1}Y \\
IX &= M^{-1}Y \\
X &= M^{-1}Y
\end{aligned}
\]
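As a small worked illustration of this technique, consider a system over $Z_5$ with the coefficient matrix M and constants Y below. The example is chosen here and is not one of the systems from Activity 9. One can confirm by direct multiplication that the stated $M^{-1}$ satisfies $MM^{-1} = I$ modulo 5.

```latex
\[
M = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}, \qquad
Y = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
M^{-1} = \begin{pmatrix} 3 & 1 \\ 4 & 2 \end{pmatrix}
\]
\[
X = M^{-1}Y
  = \begin{pmatrix} 3 \cdot 1 + 1 \cdot 1 \\ 4 \cdot 1 + 2 \cdot 1 \end{pmatrix}
  = \begin{pmatrix} 4 \\ 6 \end{pmatrix}
  \equiv \begin{pmatrix} 4 \\ 1 \end{pmatrix} \pmod 5
\]
\[
MX = \begin{pmatrix} 1 \cdot 4 + 2 \cdot 1 \\ 3 \cdot 4 + 4 \cdot 1 \end{pmatrix}
   = \begin{pmatrix} 6 \\ 16 \end{pmatrix}
   \equiv \begin{pmatrix} 1 \\ 1 \end{pmatrix} = Y \pmod 5
\]
```

The last line substitutes the computed X back into the original equation as a check.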

Now that we have a number of conditions for determining the invertibility of a matrix, we focus on one interpretation of invertible transformations (and matrices). When you worked Activity 10, you discovered a connection between changing ordered bases and invertibility. Namely, every time you apply an invertible matrix to an ordered basis, you get another ordered basis. As a result, the linear transformation defined by the application of an invertible matrix is simply a change of ordered basis. This perspective will allow us to prove the following theorem.

Theorem 6.3.8. If P is an invertible matrix, then the column rank of PM is equal to the column rank of M.

Proof. The key to the proof is that M and PM both represent the same linear transformation, except with respect to different ordered bases. Since both matrices have rank equal to that of the transformation they represent, M and PM have the same rank.

Let M be an $n \times m$ matrix and P be an invertible $n \times n$ matrix. Let A be the coordinate basis of $K^m$, B be the coordinate basis of $K^n$ and let the ordered set $C = [c_i]$ be the result of applying the inverse of P to the elements of B. Let $T : K^m \to K^n$ be the linear transformation defined by applying M with respect to A and B. In the exercises (see Exercise 16), you will prove that the matrix representation of T with respect to A and B is equal to M. We show that the matrix representation of T with respect to A and C is equal to PM.

The first step is to prove that C is really a basis. Let S be the linear transformation defined by the application of the inverse of P. Because the inverse of P is invertible, S is an invertible linear transformation. This implies that the range of S has dimension n and so must be all of $K^n$. However, the range of S is spanned by the n vectors $C = \{S(e_i)\}$ and so this set must be linearly independent by Theorem 4.4.9. As a result, C is a basis of $K^n$.

Note that because $c_i$ is the result of applying the inverse of P to the vector $e_i$, we know that the vector $e_i$ is the result of applying P to the $c_i$. In other words,

\[ e_k = \sum_i p_{ik} c_i. \]

To finish the proof, we compute the matrix representation of T with respect to the ordered bases A and C:

\[ T(e_j) = \sum_k m_{kj} e_k = \sum_k m_{kj} \Big(\sum_i p_{ik} c_i\Big) = \sum_i \Big(\sum_k p_{ik} m_{kj}\Big) c_i. \]

This calculation shows that the matrix representation of T with respect to A and C is equal to

\[ \Big(\sum_k p_{ik} m_{kj}\Big) = PM, \]

which is what was desired.


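The proof may be hard to conceptualize, but it can be clarified using the diagram notation.

\[
\begin{array}{ccccc}
K^m & \xrightarrow{\;T\;} & K^n & \xrightarrow{\;I\;} & K^n \\
{\scriptstyle A}\,\Big\updownarrow & & \Big\updownarrow\,{\scriptstyle B} & & \Big\updownarrow\,{\scriptstyle C} \\
K^m & \xrightarrow{\;M\;} & K^n & \xrightarrow{\;P\;} & K^n
\end{array}
\]

In addition, a composite arrow labeled T runs from the first $K^m$ to the last $K^n$ along the top, and a composite arrow labeled PM runs from $K^m$ to the last $K^n$ along the bottom.

At the top of the diagram, there are two arrows labelled with the linear transformation T, which has a fixed rank. At the bottom of the diagram, the corresponding arrows are labelled M and PM, so the column ranks of M and PM must be equal.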
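Theorem 6.3.8 can be checked on a concrete pair of matrices. The Python sketch below (not ISETL) computes column rank by row-reducing the transpose, since row operations on the transpose do not change the span of the columns of the original matrix. The function names, the exact arithmetic with `Fraction`, and the sample matrices M and P are assumptions made for illustration.

```python
from fractions import Fraction

def row_rank(M):
    """Number of pivots found by forward elimination with exact fractions."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    r = 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, rows):
            f = A[i][c] / A[r][c]
            A[i] = [x - f * y for x, y in zip(A[i], A[r])]
        r += 1
        if r == rows:
            break
    return r

def column_rank(M):
    """Dimension of the span of the columns of M: row rank of the transpose."""
    transpose = [list(col) for col in zip(*M)]
    return row_rank(transpose)

def mat_mul(M, N):
    """Matrix product via the entry formula sum_k m_ik * n_kj."""
    return [[sum(M[i][k] * N[k][j] for k in range(len(N)))
             for j in range(len(N[0]))]
            for i in range(len(M))]

M = [[1, 2, 1],
     [2, 0, 3],
     [3, 2, 4]]          # third row is the sum of the first two
P = [[1, 2, 0],
     [0, 1, 0],
     [1, 0, 1]]          # invertible: its column rank is 3

PM = mat_mul(P, M)
print(column_rank(P), column_rank(M), column_rank(PM))
```

Multiplying by the invertible matrix P changes the entries of M but not its column rank.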

Exercises 17 and 18 ask you to state and prove the analog of Theorem 6.3.8 for M and MP, where P is invertible. A more complete development of this theory is presented in Chapter 7.

So now we arrive at a position where we are able to prove Theorem 6.2.2. Recall that the theorem states that the column rank of a matrix is unaffected by the elementary row operations.

Proof of Theorem 6.2.2. Let N be a matrix in $K^{n \times n}$. We prove that each elementary row operation on N can be implemented by multiplying N on the left by an invertible matrix. This suffices to prove the result by Theorem 6.3.8.

The interchanging of rows i and j can be implemented by multiplying by the matrix M defined by

\[
m_{k\ell} =
\begin{cases}
1 & k = \ell, \text{ except } (k, \ell) = (i, i) \text{ and } (k, \ell) = (j, j) \\
1 & (k, \ell) = (j, i) \text{ or } (k, \ell) = (i, j) \\
0 & \text{otherwise.}
\end{cases}
\]

The multiplying of row i by the scalar a can be implemented by multiplying by the matrix M defined by

\[
m_{k\ell} =
\begin{cases}
1 & k = \ell, \text{ except } (k, \ell) = (i, i) \\
a & (k, \ell) = (i, i) \\
0 & \text{otherwise.}
\end{cases}
\]

The replacement of row i by the sum of row i and row j can be implemented by multiplying by the matrix M defined by

\[
m_{k\ell} =
\begin{cases}
1 & k = \ell \\
1 & (k, \ell) = (i, j) \\
0 & \text{otherwise.}
\end{cases}
\]

We leave as exercises the proofs that these matrices implement the appropriate elementary row operations (see Exercises 19 through 21). The fact that they are invertible follows from the fact that each operation is reversible.
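Before attempting those proofs, it may help to see the three kinds of matrices in action. The Python sketch below (not ISETL) builds each one from its entry-wise definition and multiplies a sample matrix on the left. The function names and the sample matrix are illustrative assumptions, and rows are numbered from 0 as is usual in Python, whereas the text numbers them from 1.

```python
def swap_matrix(n, i, j):
    """Identity with rows i and j interchanged (i != j, 0-based)."""
    def entry(k, l):
        if (k, l) in ((i, j), (j, i)):
            return 1
        if k == l and k not in (i, j):
            return 1
        return 0
    return [[entry(k, l) for l in range(n)] for k in range(n)]

def scale_matrix(n, i, a):
    """Identity with the (i, i) entry replaced by the scalar a."""
    return [[(a if k == i else 1) if k == l else 0 for l in range(n)]
            for k in range(n)]

def add_matrix(n, i, j):
    """Identity with an extra 1 in position (i, j), where i != j."""
    return [[1 if k == l or (k, l) == (i, j) else 0 for l in range(n)]
            for k in range(n)]

def mat_mul(M, N):
    """Matrix product via the entry formula sum_k m_ik * n_kj."""
    return [[sum(M[r][k] * N[k][c] for k in range(len(N)))
             for c in range(len(N[0]))]
            for r in range(len(M))]

N = [[1, 2],
     [3, 4],
     [5, 6]]

swapped = mat_mul(swap_matrix(3, 0, 2), N)   # rows 0 and 2 interchanged
scaled = mat_mul(scale_matrix(3, 1, 7), N)   # row 1 multiplied by 7
added = mat_mul(add_matrix(3, 0, 1), N)      # row 0 replaced by row 0 + row 1
print(swapped, scaled, added)
```

In each case the elementary matrix multiplies N on the left, and the product is N with the corresponding row operation applied.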

Exercises

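1. For each of the matrices M, N over R below, compute the product MN.

(a)
\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 2 & 3 & 4 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 2 & 3 \\ 2 & 1 \\ 3 & 1 \end{pmatrix} \]

(b)
\[ M = \begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 3 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 3 & 0 & 0 \\ 2 & 4 & 0 \\ 1 & 3 & 1 \end{pmatrix} \]

(c)
\[ M = \begin{pmatrix} 4 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 3 \end{pmatrix} \quad \text{and} \quad N = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 6 \end{pmatrix} \]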


3. Provide the proof of associativity in Theorem 6.3.2: for any three matrices A, B, and C, (AB)C = A(BC), whenever the products are defined.

4. Complete the proof of the existence of an identity matrix in Theorem 6.3.2: for any matrix A, AI = A = IA, whenever the product is defined.

5. Provide the proof that multiplication by the zero matrix always results in the zero matrix in Theorem 6.3.2: for any matrix A, ZA = Z = AZ, whenever the products are defined.

6. Define $T, S : R^3 \to R^2$ by

\[
\begin{aligned}
T(\langle x, y, z \rangle) &= \langle x + 2y, z \rangle \\
S(\langle x, y, z \rangle) &= \langle y - z, z - x \rangle.
\end{aligned}
\]

Compute the matrix representations of the following linear transformations with respect to the coordinate bases.

(a) $T \circ S$

(b) $2T \circ 3S$

(c) $T \circ (2S + T + I)$



9. Assume that M and N are square matrices. Prove that if M and N are invertible, then MN is invertible.

10. Assume that M and N are square matrices. Prove that if MN is invertible, then M and N are both invertible.

11. For each linear transformation T over R given below, compute its inverse.

(a) $T(\langle x, y \rangle) = \langle x + y, x - y \rangle$

(b) $T(\langle x, y, z \rangle) = \langle x + 2y, 3x - z, 4x + 2y + 3z \rangle$

(c) $T(\langle x, y, z, w \rangle) = \langle x + y + z, y + z, z + w, x + w \rangle$

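12. Solve each system of equations over R given below. (Hint: Are there similarities between these systems’ coefficient matrices that can be exploited?)

(a)
\[
\begin{aligned}
x + 7y + z &= 14 \\
2x + 2y - 10z &= 4 \\
3y + 4z &= 7
\end{aligned}
\]

(b)
\[
\begin{aligned}
x + 7y + z &= 2 \\
2x + 2y - 10z &= 3 \\
3y + 4z &= 2
\end{aligned}
\]

(c)
\[
\begin{aligned}
x + 7y + z &= 0 \\
2x + 2y - 10z &= 0 \\
3y + 4z &= 0
\end{aligned}
\]

(d)
\[
\begin{aligned}
x + 7y + z &= 7 \\
2x + 2y - 10z &= 1 \\
3y + 4z &= 7
\end{aligned}
\]

(e)
\[
\begin{aligned}
x + 7y + z &= 1 \\
2x + 2y - 10z &= 1 \\
3y + 4z &= 1
\end{aligned}
\]

(f)
\[
\begin{aligned}
x + 7y + z &= 2 \\
2x + 2y - 10z &= 4 \\
3y + 4z &= 6
\end{aligned}
\]

(g)
\[
\begin{aligned}
x + 7y + z &= \sqrt{2} \\
2x + 2y - 10z &= \pi \\
3y + 4z &= e
\end{aligned}
\]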

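13. Solve each system of equations over R below. (Hint: Are there similarities between these systems’ coefficient matrices that can be exploited?)

(a)
\[
\begin{aligned}
2x + 3y + 2z + w &= 3 \\
x + y + z &= 2 \\
2y + 2z &= 3 \\
3z + 4w &= 7
\end{aligned}
\]

(b)
\[
\begin{aligned}
2x + 3y + 2z + w &= 1 \\
x + y + z &= 3 \\
2y + 2z &= 4 \\
3z + 4w &= 4
\end{aligned}
\]

(c)
\[
\begin{aligned}
2x + 3y + 2z + w &= 1 \\
x + y + z &= 1 \\
2y + 2z &= 0 \\
3z + 4w &= 0
\end{aligned}
\]

(d)
\[
\begin{aligned}
2x + 3y + 2z + w &= 2 \\
x + y + z &= 5 \\
2y + 2z &= -8 \\
3z + 4w &= -10
\end{aligned}
\]

(e)
\[
\begin{aligned}
2x + 3y + 2z + w &= 4 \\
x + y + z &= 2 \\
2y + 2z &= 3 \\
3z + 4w &= 47
\end{aligned}
\]

(f)
\[
\begin{aligned}
2x + 3y + 2z + w &= \sqrt{3} \\
x + y + z &= \pi^2 \\
2y + 2z &= e^e \\
3z + 4w &= 0
\end{aligned}
\]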

14. Let $D : P_2(\mathbb{R}) \to P_1(\mathbb{R})$ be the linear transformation $D(p) = p'$, the derivative of p. Let $J : P_1(\mathbb{R}) \to P_2(\mathbb{R})$ be the linear transformation where $J(p) = \int_0^x p(t)\, dt$.

(a) Compute the matrix representation M of D with respect to the ordered bases $[1, x, x^2]$ and $[1, x]$.

(b) Compute the matrix representation N of J with respect to the ordered bases $[1, x]$ and $[1, x, x^2]$.

(c) Compute the matrix product MN. What does this say about the linear transformation $D \circ J$?

(d) Compute the matrix product NM. What does this say about $J \circ D$?

15. Let $D : P_5(\mathbb{R}) \to P_4(\mathbb{R})$ be the linear transformation $D(p) = p'$, the derivative of p. Let $J : P_4(\mathbb{R}) \to P_5(\mathbb{R})$ be the linear transformation where $J(p) = \int_0^x p(t)\, dt$.

(a) Compute the matrix representation M of D with respect to the ordered bases $[1, x, x^2, x^3, x^4, x^5]$ and $[1, x, x^2, x^3, x^4]$.

(b) Assume more generally that $D : P_n(\mathbb{R}) \to P_{n-1}(\mathbb{R})$ is the linear transformation $D(p) = p'$, the derivative of p. What is the matrix representation of D with respect to the ordered bases $[1, \ldots, x^n]$ and $[1, \ldots, x^{n-1}]$?

(c) Compute the matrix representation N of J with respect to the ordered bases $[1, x, x^2, x^3, x^4]$ and $[1, x, x^2, x^3, x^4, x^5]$.

(d) Assume more generally that $J : P_{n-1}(\mathbb{R}) \to P_n(\mathbb{R})$ is the linear transformation $J(p) = \int_0^x p(t)\, dt$. What is the matrix representation of J with respect to the ordered bases $[1, \ldots, x^{n-1}]$ and $[1, \ldots, x^n]$?

16. Let M be an n × m matrix and $T : K^m \to K^n$ be the linear transformation defined by application of the matrix M. Prove that the matrix representation of T is equal to M. Notice that we have not specified ordered bases for this exercise. Why was that omission acceptable in this case?

17. State the theorem analogous to Theorem 6.3.8 for the matrices MP and M.

18. Prove the theorem you stated in Exercise 17.

19. Let M be defined as

\[
m_{k,\ell} =
\begin{cases}
1 & k = \ell \text{ except } (k,\ell) = (i,i) \text{ and } (k,\ell) = (j,j)\\
1 & (k,\ell) = (j,i) \text{ or } (k,\ell) = (i,j)\\
0 & \text{otherwise.}
\end{cases}
\]

Prove that the matrix MN is the result of interchanging rows i and j in the matrix N.

20. Let M be defined as

\[
m_{k,\ell} =
\begin{cases}
1 & k = \ell \text{ except } (k,\ell) = (i,i)\\
a & (k,\ell) = (i,i)\\
0 & \text{otherwise.}
\end{cases}
\]

Prove that the matrix MN is the result of multiplying row i of the matrix N by a.

21. Let M be defined as

\[
m_{k,\ell} =
\begin{cases}
1 & k = \ell\\
1 & (k,\ell) = (i,j)\\
0 & \text{otherwise.}
\end{cases}
\]

Prove that the matrix MN is the result of replacing row i of the matrix N with the sum of rows i and j of the matrix N.

22. Although we have defined a matrix product above, another possibility for a product would have been

\[
A \otimes B = \tfrac{1}{2} AB + \tfrac{1}{2} BA,
\]

where juxtaposition indicates the product defined in this section. Prove that this new product is linear and commutative. For A, I and Z of the same size, show that $I \otimes A = A = A \otimes I$ and $Z \otimes A = Z = A \otimes Z$. Give an example of three matrices A, B and C for which $(A \otimes B) \otimes C \neq A \otimes (B \otimes C)$.


6.4 Determinants

Activities

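1. Define a func called det which accepts a 2 × 2 matrix $M = (m_{ij})$ and returns the value $m_{11}m_{22} - m_{12}m_{21}$. Your code may assume that as and ms implement addition and multiplication.

Now define as and ms to implement arithmetic mod 5. For each of the following pairs of matrices, compute det(M), det(N) and det(MN). Make a conjecture based on your findings.

(a)
\[
M = \begin{pmatrix} 2 & 3\\ 1 & 4 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 3 & 4\\ 0 & 1 \end{pmatrix}
\]

(b)
\[
M = \begin{pmatrix} 1 & 3\\ 2 & 1 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 0 & 1\\ 2 & 4 \end{pmatrix}
\]

(c)
\[
M = \begin{pmatrix} 2 & 0\\ 1 & 3 \end{pmatrix}
\quad\text{and}\quad
N = \begin{pmatrix} 1 & 3\\ 3 & 4 \end{pmatrix}
\]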

Discussion

You probably noticed that this activity section was rather sparse. Often you may want to determine whether a matrix is invertible, but you are not interested in the value of the inverse matrix. The purpose of this section is to provide you with a technique for determining the invertibility of a matrix through a single computation. We provide such a function with the following definition.

Definition 6.4.1. If M is a 1 × 1 matrix, then the determinant of M is $m_{11}$. The determinant of M is denoted by det(M).

If M is an n × n matrix, the (i, j) cofactor of M is defined as the determinant of the (n − 1) × (n − 1) matrix obtained by removing the ith row and jth column from the matrix M. This cofactor is denoted by $M_{ij}$.

If M is an n × n matrix, the determinant of M is defined as any of the following:

\[
\det(M) = \sum_{j=1}^{n} (-1)^{i+j}\, m_{ij} M_{ij}
\]

\[
\det(M) = \sum_{i=1}^{n} (-1)^{i+j}\, m_{ij} M_{ij}
\]

Note that the first formula represents n different summations (one for each fixed i with 1 ≤ i ≤ n) and the second formula represents n different summations (one for each fixed j with 1 ≤ j ≤ n). The first formula is called the expansion along the ith row, and the second formula is called the expansion along the jth column.

We currently do not have the tools to prove the following theorem, but it is important because it states that there is no ambiguity in the above definition.

Theorem 6.4.1. All of the formulas in Definition 6.4.1 produce the same scalar.
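Although we do not prove Theorem 6.4.1, it can be checked on examples. The following is a minimal sketch written in Python rather than ISETL; the names `minor`, `det_row`, and `det_col` are our own illustrative choices, not funcs defined in this text. It expands one 3 × 3 matrix along every row and every column and collects the values obtained.

```python
def minor(m, i, j):
    """Matrix obtained from m by deleting row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]


def det_row(m, i=0):
    """Determinant by cofactor expansion along row i (0-based)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    # (-1)**(i + j) has the same sign for 0-based and 1-based indices.
    return sum((-1) ** (i + j) * m[i][j] * det_row(minor(m, i, j))
               for j in range(n))


def det_col(m, j=0):
    """Determinant by cofactor expansion along column j (0-based)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    return sum((-1) ** (i + j) * m[i][j] * det_row(minor(m, i, j))
               for i in range(n))


M = [[2, 0, 0],
     [1, 2, 0],
     [2, 1, 3]]

# Collect the value of every row expansion and every column expansion.
values = {det_row(M, i) for i in range(3)} | {det_col(M, j) for j in range(3)}
print(values)
```

All six expansions of this matrix give the same scalar, which is what the theorem asserts in general. A check on one matrix is, of course, not a proof.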

The formula for computing the determinants of 2 × 2 matrices was provided in Activity 1 and is worth remembering. We provide it here again.

\[
\det\begin{pmatrix} a & b\\ c & d \end{pmatrix} = ad - bc
\]

The formula for computing determinants of 3 × 3 matrices is slightly more complicated, but also worth remembering. We provide the formula for expansion along the first row here:

\[
\det\begin{pmatrix}
m_{11} & m_{12} & m_{13}\\
m_{21} & m_{22} & m_{23}\\
m_{31} & m_{32} & m_{33}
\end{pmatrix}
= m_{11}(m_{22}m_{33} - m_{32}m_{23}) - m_{12}(m_{21}m_{33} - m_{31}m_{23}) + m_{13}(m_{21}m_{32} - m_{31}m_{22}).
\]

Since the value of the determinant of a matrix is independent of the row or column selected when performing a cofactor expansion, the easiest way to compute the determinant of a matrix M is to expand along the row or column that contains the largest number of zero entries.


Theorem 6.4.2. If Z is a zero matrix, then det(Z) = 0. If I is an identity matrix, then det(I) = 1.

Proof. See Exercises 2 and 3.

In Activity 1, you discovered one important property of 2 × 2 determinants, namely, that the determinant of a product of matrices is the product of the determinants of those matrices. This holds in general, although we will not prove it in this text.

Theorem 6.4.3. For any pair of n × n matrices M and N, det(MN) = det(M) det(N).
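As an illustration of Theorem 6.4.3, the sketch below recomputes the three pairs from Activity 1 with arithmetic mod 5. It is written in Python, not ISETL, and the helper names `det2` and `mul2` are hypothetical stand-ins for the funcs you wrote with as and ms.

```python
P = 5  # modulus used in Activity 1


def det2(m):
    """Determinant of a 2 x 2 matrix, reduced mod P."""
    return (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % P


def mul2(a, b):
    """Product of two 2 x 2 matrices, with entries reduced mod P."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) % P for j in range(2)]
            for i in range(2)]


pairs = [
    ([[2, 3], [1, 4]], [[3, 4], [0, 1]]),  # Activity 1(a)
    ([[1, 3], [2, 1]], [[0, 1], [2, 4]]),  # Activity 1(b)
    ([[2, 0], [1, 3]], [[1, 3], [3, 4]]),  # Activity 1(c)
]

# Each triple is (det(M), det(N), det(MN)), all taken mod P.
results = [(det2(M), det2(N), det2(mul2(M, N))) for M, N in pairs]
for d_m, d_n, d_mn in results:
    print(d_m, d_n, d_mn, (d_m * d_n) % P == d_mn)
```

In each case the product of the first two values, reduced mod 5, agrees with the third. Three examples support the conjecture but do not prove the theorem.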

Theorem 6.4.4. The matrix M is invertible if and only if det(M) ≠ 0.

Proof. If M is invertible with inverse N, then det(M) det(N) = det(MN) = det(I) = 1. This implies that det(M) ≠ 0.

The proof that det(M) ≠ 0 implies that M has an inverse can be found in more advanced linear algebra texts.

As you might have noticed, our coverage of determinants has been short, and many of the theorems have been left unproven. There is enough information about determinants to fill an entire chapter, but, for the moment, we only need a few results. Complete proofs would take us far afield.

Exercises

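1. Compute the determinants of the following matrices.

(a) $M = \begin{pmatrix} 1 & 3\\ 2 & 4 \end{pmatrix}$, over $\mathbb{R}$

(b) $M = \begin{pmatrix} 2 & 3\\ 1 & 4 \end{pmatrix}$, over $\mathbb{Z}_5$

(c) $M = \begin{pmatrix} 2 & 2 & 1\\ 0 & 1 & 0\\ 3 & 2 & 1 \end{pmatrix}$, over $\mathbb{Z}_5$

(d) $M = \begin{pmatrix} 2 & 0 & 0\\ 1 & 2 & 0\\ 2 & 1 & 3 \end{pmatrix}$, over $\mathbb{R}$

(e) $M = \begin{pmatrix} x & 3 & 2\\ 2 & x & 2\\ 1 & 2 & x \end{pmatrix}$, over $\mathbb{R}$

2. Prove that if Z is the zero matrix, then det(Z) = 0. (Hint: This proof requires the use of mathematical induction.)

3. Prove that if I is the identity matrix, then det(I) = 1. (Hint: This proof requires the use of mathematical induction.)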

Chapter 7

Getting to Second Bases

So the grand finale certainly has an interesting title. In this last chapter, we want to look at some of the ways that linear algebra is useful, other than being a great way to spend a semester with your favorite mathematician. Specifically, we explore the power that is harnessed by using matrix representations for linear transformations. We revisit basis of a vector space and see that by choosing wisely, much work can be avoided! And the fun we will have with eigenstuff...


7.1 Change of Basis

Activities

1. Complete parts (a)–(c) for each of the matrices $A = (a_{ij})$ given below. Use the information obtained in (a)–(c) to answer the question posed in (d).

\[
A_1 = \begin{pmatrix} 1 & 2 & 3\\ 2 & 3 & 1\\ 3 & 1 & 2 \end{pmatrix}
\qquad
A_2 = \begin{pmatrix}
\tfrac12 & \tfrac18 & \tfrac14 & \tfrac18\\
0 & \tfrac12 & 0 & \tfrac12\\
\tfrac14 & \tfrac12 & 0 & \tfrac14\\
0 & \tfrac13 & 0 & \tfrac23
\end{pmatrix}
\qquad
A_3 = \begin{pmatrix} -1 & 1 & -1\\ 3 & -2 & 3\\ -5 & 5 & -4 \end{pmatrix}
\]

(a) Show that A is invertible by applying a tool from Chapter 6 to construct its inverse $S = (s_{ij})$.

(b) For any $x = \langle x_1, x_2, \ldots, x_n\rangle \in \mathbb{R}^n$, where n denotes the number of columns in A, define the vector $y = \langle y_1, y_2, \ldots, y_n\rangle \in \mathbb{R}^n$ by

\[
y = S \cdot x.
\]

Select three nonzero vectors x, and use the equation to find three corresponding vectors y.

(c) Let $\mathbf{b}_j$ be the vector whose components are the entries of the jth column of A. Check to see that the sequence $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ forms a basis for $\mathbb{R}^n$, and then, for each vector y, compute the sum

\[
z = \sum_{j=1}^{n} y_j \mathbf{b}_j.
\]

Compare each vector z to the vector x to which it corresponds. What do you observe?


(d) Describe the procedure alluded to in (a)–(c). Given a basis $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ for $\mathbb{R}^n$, how do we find the coordinate vector $[x]_{\mathcal{B}}$ of x, that is, the vector $\langle s_1, s_2, \ldots, s_n\rangle$ whose components are the scalars in the expression

\[
x = \sum_{i=1}^{n} s_i \mathbf{b}_i?
\]

2. Construct a func ChangeCoeff that will accept an ordered basis $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ for $\mathbb{R}^n$ and a vector $x \in \mathbb{R}^n$ and that returns the coordinate vector $[x]_{\mathcal{B}} = \langle s_1, s_2, \ldots, s_n\rangle$ of x. This is the vector whose components are the coefficients of the equation

\[
x = \sum_{i=1}^{n} s_i \mathbf{b}_i.
\]

Check your func for each x that you selected in Activity 1.

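3. Let $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ be an ordered basis for $\mathbb{R}^n$. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation defined by $T(x) = C \cdot x$, where each matrix C is defined below. Complete (a)–(d) for each transformation T and basis $\mathcal{B}$.

\[
C_1 = \begin{pmatrix} 9 & -7 & 7\\ 3 & -1 & 3\\ -5 & 5 & -3 \end{pmatrix}
\]

$\mathcal{B}_1$ is formed from the columns of the matrix $A_1$ defined in Activity 1.

\[
C_2 = \begin{pmatrix} 1 & 3 & -2 & 3\\ 2 & -1 & 5 & 1\\ 1 & 4 & 0 & 8\\ 2 & 5 & -3 & 0 \end{pmatrix}
\]

\[
C_3 = \begin{pmatrix} 9 & 1 & -1\\ -2 & -1 & 3\\ 3 & 5 & -4 \end{pmatrix}
\]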


(a) Verify that the matrix representation of T with respect to the coordinate basis is equal to the matrix C.

(b) Let M be the matrix whose jth column is the sequence of components of the vector $\mathbf{b}_j \in \mathcal{B}$. Show that M is invertible, and find its inverse $M^{-1}$. Compute $M^{-1}CM$.

(c) Select two nonzero vectors $x \in \mathbb{R}^n$. Apply the func ChangeCoeff to find the coordinate vector $[x]_{\mathcal{B}}$ of x with respect to the basis $\mathcal{B}$. Compute $(M^{-1}CM) \cdot [x]_{\mathcal{B}}$.

(d) Compute T(x). Apply the func ChangeCoeff to find $[T(x)]_{\mathcal{B}}$. Compare $(M^{-1}CM) \cdot [x]_{\mathcal{B}}$ and $[T(x)]_{\mathcal{B}}$.

4. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation, and let C be the matrix representation of T with respect to the coordinate basis. Based upon your experience with Activity 3, construct a func ChangeFromCoordB that will accept the matrix representation C and an ordered basis $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ and return the matrix representation with respect to the basis $\mathcal{B}$. Check your func for each transformation defined in Activity 3.

5. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation, and let B be the matrix representation of T with respect to an ordered basis $\mathcal{B}$. Construct a func ChangeToCoordB that will accept the matrix representation B of T with respect to $\mathcal{B}$ and return the matrix representation of T with respect to the coordinate basis. Check your func for each transformation defined in Activity 3.

6. In the following examples, try to find a basis that transforms the given matrix B to C having the indicated form.

(a) The matrix B is given by

\[
B_1 = \begin{pmatrix} 2 & 2\\ 5 & -1 \end{pmatrix}.
\]

The matrix C is to have a diagonal form, that is, all entries other than those on the main diagonal are zero.

(b) The matrix B is given by

\[
B_2 = \begin{pmatrix} 19 & -18 & 5\\ 32 & -33 & 10\\ 62 & -69 & 22 \end{pmatrix}.
\]

The matrix C is to have a lower triangular form. This means that all entries “above” the main diagonal are zero.

(c) The matrix B is the matrix $C_3$ from Activity 3. The matrix C is to have a diagonal form.

Discussion

In Chapter 6, you learned how to find coordinate vectors and matrix representations. In both cases, you discovered that neither is unique. For a vector u ∈ U, the components of its corresponding coordinate vector in $K^n$ depend upon the basis selected for U. Similarly, the form of a matrix representation of a linear transformation $T : U \to V$ depends upon the bases selected for U and V. In this section, we investigate the relationship between representations for different bases. In particular, if u ∈ U, and if $\mathcal{B}$ and $\mathcal{C}$ are two bases for U, what is the relationship between the two coordinate vectors $[u]_{\mathcal{B}}$ and $[u]_{\mathcal{C}}$? If $T : U \to U$ is a linear transformation from U to itself, how is the matrix representation with respect to $\mathcal{B}$ related to that for $\mathcal{C}$? In the first subsection, we will discuss change of basis in relation to coordinate vectors. In the second subsection, we will see how to transform a matrix representation from one basis into another. Throughout this chapter, you will be introduced to examples that show why the ability to change bases is important.

Coordinate Vectors

If $V = K^n$, and if the given basis is the coordinate basis $\mathcal{C} = [\mathbf{e}_1, \ldots, \mathbf{e}_n]$, the coordinate vector $[v]_{\mathcal{C}}$ of any $v \in K^n$ is simply the vector v itself. Do you remember what the form of each $\mathbf{e}_i$ vector is? Can you explain why the coordinate vector in this case is the same as v?

There are many instances in which we need to work with a basis other than the coordinate basis. In such a case, the coordinate vector of $v \in K^n$, as you discovered in Activity 1, is not equal to v. Our interest in this section is to study the relationship between coordinate vectors and different bases, and to understand the procedure for changing from one basis to another. The func ChangeCoeff that you wrote in Activity 2 involves changing from the coordinate basis to a second basis $\mathcal{B}$. Is this func consistent with the following theorem?

Theorem 7.1.1. Given a basis $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ for a vector space $K^n$, let M be the matrix whose jth column entries are the components of the vector $\mathbf{b}_j$. Then, the matrix M is invertible, and, given any vector $x = \langle x_1, x_2, \ldots, x_n\rangle \in K^n$, the vector given by

\[
y = M^{-1} \cdot x
\]

gives the coordinate vector of x with respect to $\mathcal{B}$, that is,

\[
x = \sum_{i=1}^{n} y_i \mathbf{b}_i.
\]

Proof. The proof of this theorem is a tour de force of notation together with calculations involving sequences, summations, and multi-indices. One strategy in understanding the proof is to take a very specific example and follow through the formulas with that example. Other than heavy notation, the steps of the argument are not particularly difficult.

According to the Corollary of Theorem 6.3.4, the matrix M defined in the statement of the theorem is invertible since its row rank is n.

Now we introduce some notation. All indices are assumed to run from 1 to n. In the double index for an element of a matrix, the first index counts the rows, the second indicates the columns. Let $M = (t_{ij})$, $M^{-1} = (s_{ij})$, and let $\mathcal{C} = [\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n]$ be the coordinate basis for $\mathbb{R}^n$.

Since the entries of the ith column of M are the components of the basis vector $\mathbf{b}_i$, M applied to each coordinate basis vector $\mathbf{e}_i$ yields

\[
\mathbf{b}_i = M \cdot \mathbf{e}_i = \sum_k t_{ki}\, \mathbf{e}_k.
\]

Since $MM^{-1}$ is the identity matrix, we know that the (k, j) entry of the product is 0 for all values of the indices, except when k = j, in which case the entry is 1. A convenient and standard way of expressing this is through the symbol $\delta_{kj}$, called the Kronecker delta. This is nothing more than a shorthand for the long statement: 0, if k is different from j; 1 otherwise. Thus, we have

\[
\sum_i t_{ki} s_{ij} = \delta_{kj}.
\]

As defined in the activities, let $y = \langle y_1, y_2, \ldots, y_n\rangle$ be the vector given by the product of $M^{-1}$ and $x = \langle x_1, x_2, \ldots, x_n\rangle$. Using summation notation, we have the following expression for each component of y,

\[
y_i = \sum_j s_{ij} x_j.
\]

In terms of $\mathcal{C} = [\mathbf{e}_1, \ldots, \mathbf{e}_n]$, y is of the form

\[
y = \sum_i \sum_j s_{ij} x_j\, \mathbf{e}_i.
\]

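Now that we have established these relationships, we are ready to show that $y = M^{-1} \cdot x$ is the coordinate vector of x with respect to $\mathcal{B}$, that is, that $\sum_i y_i \mathbf{b}_i = x$. Before looking at the explanations that follow, justify, for yourself, each step of the calculation:

\begin{align*}
\sum_i y_i \mathbf{b}_i &= \sum_i \sum_j s_{ij} x_j\, M \mathbf{e}_i\\
&= \sum_i \sum_j s_{ij} x_j \sum_k t_{ki}\, \mathbf{e}_k\\
&= \sum_k \Bigl( \sum_j \Bigl( \sum_i t_{ki} s_{ij} \Bigr) x_j \Bigr) \mathbf{e}_k\\
&= \sum_k \Bigl( \sum_j \delta_{kj} x_j \Bigr) \mathbf{e}_k\\
&= \sum_k x_k\, \mathbf{e}_k\\
&= x.
\end{align*}

• In the first line, we substituted the ith component of $M^{-1} \cdot x$ for $y_i$, and replaced $\mathbf{b}_i$ by its equivalent formulation $M \cdot \mathbf{e}_i$.

• In the second line, we expressed $M \cdot \mathbf{e}_i$ in terms of the entries in the ith column of the matrix M.

• In the third line, we reordered, rearranged, and collected terms in this triple summation.

• In the fourth line, we replaced the expression for the product $MM^{-1}$ by its value in terms of the Kronecker delta.

• In the fifth line, we evaluated each $\delta_{kj}$, dropping all 0 terms.

• In the last line, we noted that the expression was the expansion of x in terms of the coordinate basis.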

Let’s summarize what we discovered thus far.

1. If $V = K^n$, the coordinate vector of $v \in K^n$ with respect to the coordinate basis is equal to v.

2. If $\mathcal{B} = [\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_n]$ is any basis, we can find the coordinate vector of v by computing the product

\[
M^{-1} v,
\]

where M is the matrix whose jth column is the sequence of the components of the vector $\mathbf{b}_j$. From this point forward, we will refer to a matrix such as M as a transition matrix.

3. Given $[v]_{\mathcal{B}}$, the coordinate vector of v with respect to $\mathcal{B}$, we can find v by computing the product

\[
M [v]_{\mathcal{B}}.
\]
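The three points above can be sketched in code. The block below is a Python illustration, not ISETL and not the ChangeCoeff func from Activity 2; the names `solve`, `coordinate_vector`, and `from_coordinates` are hypothetical. Instead of forming $M^{-1}$ explicitly, it solves $Ms = x$ by Gauss-Jordan elimination with exact fractions, which yields the same vector $s = M^{-1}x$. The basis used is the one formed from the columns of $A_1$ in Activity 1.

```python
from fractions import Fraction


def solve(M, x):
    """Solve M s = x exactly by Gauss-Jordan elimination.

    Assumes M is square and invertible; a singular M makes next() fail.
    """
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(x[i])]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def coordinate_vector(basis, x):
    """Return [x]_B: the transition matrix M has the basis vectors as columns."""
    n = len(basis)
    M = [[basis[j][i] for j in range(n)] for i in range(n)]
    return solve(M, x)


def from_coordinates(basis, s):
    """Recover x = M [x]_B, that is, the sum of s_j * b_j."""
    n = len(basis)
    return [sum(s[j] * basis[j][i] for j in range(n)) for i in range(n)]


basis = [(1, 2, 3), (2, 3, 1), (3, 1, 2)]   # columns of A1
x = [6, 6, 6]
s = coordinate_vector(basis, x)
print(s, from_coordinates(basis, s))
```

Here x is the sum of the three basis vectors, so each coordinate with respect to this basis is 1, and multiplying back by M recovers x.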

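If $\mathcal{B}_1$ and $\mathcal{B}_2$ are two bases for $V = K^n$, and if v ∈ V, what is the relationship between $[v]_{\mathcal{B}_1}$ and $[v]_{\mathcal{B}_2}$? How do we get from $[v]_{\mathcal{B}_1}$ to $[v]_{\mathcal{B}_2}$? From $[v]_{\mathcal{B}_2}$ to $[v]_{\mathcal{B}_1}$? The diagram given below illustrates these relationships. What are the entries of the transition matrices $M_{\mathcal{B}_1}$ and $M_{\mathcal{B}_2}$?

\[
\begin{aligned}
{[v]_{\mathcal{B}_2}} &\xrightarrow{\;M_{\mathcal{B}_1}^{-1} M_{\mathcal{B}_2}\;} [v]_{\mathcal{B}_1} &&\text{(left)}\\
v &\xrightarrow{\;M_{\mathcal{B}_2}^{-1}\;} [v]_{\mathcal{B}_2},
\qquad
v \xrightarrow{\;M_{\mathcal{B}_1}^{-1}\;} [v]_{\mathcal{B}_1} &&\text{(middle)}\\
{[v]_{\mathcal{B}_1}} &\xrightarrow{\;M_{\mathcal{B}_2}^{-1} M_{\mathcal{B}_1}\;} [v]_{\mathcal{B}_2} &&\text{(right)}
\end{aligned}
\]

Starting on the left, we see that multiplying the coordinate vector $[v]_{\mathcal{B}_2}$ by the matrix $M_{\mathcal{B}_1}^{-1} M_{\mathcal{B}_2}$ yields $[v]_{\mathcal{B}_1}$. Note that we are multiplying by the product of two matrices: the first matrix applied, $M_{\mathcal{B}_2}$, goes from $\mathcal{B}_2$ to the coordinate basis, and the second, $M_{\mathcal{B}_1}^{-1}$, goes “back” from the coordinate basis to $\mathcal{B}_1$. In the middle, if we multiply the vector v by the matrix $M_{\mathcal{B}_2}^{-1}$, then we get the coordinate vector $[v]_{\mathcal{B}_2}$. If, on the other hand, we multiply v by the matrix $M_{\mathcal{B}_1}^{-1}$, we get the coordinate vector $[v]_{\mathcal{B}_1}$. On the right, if we start with the coordinate vector $[v]_{\mathcal{B}_1}$ and multiply by the matrix $M_{\mathcal{B}_2}^{-1} M_{\mathcal{B}_1}$, we get the coordinate vector $[v]_{\mathcal{B}_2}$. If we start at any “node”, $[v]_{\mathcal{B}_2}$, v, or $[v]_{\mathcal{B}_1}$, we can get to any other node by following the appropriate arrow, or sequence of arrows. Since the coordinate vector of v with respect to the coordinate basis $\mathcal{C}$ is equal to v itself, we can actually say that the matrix $M_{\mathcal{B}_1}^{-1}$ is the matrix that transforms $[v]_{\mathcal{C}}$ into $[v]_{\mathcal{B}_1}$. What matrix would we use to transform $[v]_{\mathcal{B}_1}$ into $[v]_{\mathcal{C}}$? $[v]_{\mathcal{B}_2}$ into $[v]_{\mathcal{C}}$? $[v]_{\mathcal{C}}$ into $[v]_{\mathcal{B}_2}$?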
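The passage between two non-coordinate bases can be tried out numerically. The following Python sketch (not ISETL) uses two bases of $\mathbb{R}^2$ chosen purely for illustration; they do not appear in the text, and the helper names `inv2`, `matmul`, `matvec`, and `columns` are likewise our own.

```python
from fractions import Fraction


def inv2(m):
    """Inverse of a 2 x 2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def matvec(m, v):
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]


def columns(basis):
    """Transition matrix whose jth column is the jth basis vector."""
    n = len(basis)
    return [[basis[j][i] for j in range(n)] for i in range(n)]


basis1 = [(1, 1), (1, -1)]   # illustrative basis B1
basis2 = [(2, 1), (1, 1)]    # illustrative basis B2
M1, M2 = columns(basis1), columns(basis2)

v = [3, 1]
v_B1 = matvec(inv2(M1), v)   # coordinates of v with respect to B1
v_B2 = matvec(inv2(M2), v)   # coordinates of v with respect to B2

# Going from B2-coordinates to B1-coordinates: multiply by M1^{-1} M2.
transition = matmul(inv2(M1), M2)
print(v_B1, v_B2, matvec(transition, v_B2))
```

Applying the product $M_{\mathcal{B}_1}^{-1} M_{\mathcal{B}_2}$ to the second coordinate vector reproduces the first, as described for the left side of the diagram.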

Alias and alibi. There is a point of interpretation that may at first seem confusing. However, it is interesting and important, because it appears in many different situations. In particular, we can think of a basis as a frame of reference for locating vectors. If the vector space is $\mathbb{R}^n$, and if the basis is the coordinate basis $\mathcal{C} = [\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n]$, then the coefficients of a vector $v \in \mathbb{R}^n$ with respect to this basis are precisely its coordinates in a coordinate system in which the basis vectors are the axes.

The same is true of any basis. Consider the basis $\mathcal{B} = [\langle 2, 1\rangle, \langle -1, 3\rangle]$ in $\mathbb{R}^2$. The vector $\langle 1, 2\rangle \in \mathbb{R}^2$ has the coordinates 1 and 2 with respect to the coordinate basis. What are its coordinates with respect to the basis $\mathcal{B}$? With respect to $\mathcal{B}$, $[\langle 1, 2\rangle]_{\mathcal{B}} = \langle \tfrac{5}{7}, \tfrac{3}{7}\rangle$. Can you show how to get this using Theorem 7.1.1?
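One way to carry out this computation, written here as a worked sketch: form the matrix M whose columns are the two basis vectors, invert it with the usual 2 × 2 formula, and apply Theorem 7.1.1.

```latex
\[
M = \begin{pmatrix} 2 & -1\\ 1 & 3 \end{pmatrix},
\qquad
\det(M) = 2\cdot 3 - (-1)\cdot 1 = 7,
\qquad
M^{-1} = \frac{1}{7}\begin{pmatrix} 3 & 1\\ -1 & 2 \end{pmatrix}
\]
\[
[\langle 1, 2\rangle]_{\mathcal{B}}
= M^{-1}\begin{pmatrix} 1\\ 2 \end{pmatrix}
= \frac{1}{7}\begin{pmatrix} 3\cdot 1 + 1\cdot 2\\ -1\cdot 1 + 2\cdot 2 \end{pmatrix}
= \begin{pmatrix} 5/7\\ 3/7 \end{pmatrix}
\]
\[
\text{Check: }\;
\tfrac{5}{7}\langle 2, 1\rangle + \tfrac{3}{7}\langle -1, 3\rangle
= \left\langle \tfrac{10-3}{7},\; \tfrac{5+9}{7} \right\rangle
= \langle 1, 2\rangle .
\]
```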


Need a picture of a vector in $\mathbb{R}^3$ in which the coordinates of the vector are clearly shown to be “arrows” along the coordinate axes.

Figure 7.1: Basis

A picture in $\mathbb{R}^2$ showing the basis vectors b and appropriate lines to $\mathbf{e}_1, \mathbf{e}_2$ indicating the coordinates as lengths and the same thing relative to $\mathbf{b}_1, \mathbf{b}_2$.

Figure 7.2: Basis representation

But we could also show this in another way as given in the accompanying figure.

Here we have a coordinate system with only $\mathbf{b}_1, \mathbf{b}_2$ as axes, shown in the same position as the coordinate axes are shown and with the new coordinates of x indicated.

Figure 7.3: An alternate representation

Now come the interpretations. In the first figure, we may consider that the vector $\langle 1, 2\rangle$ is unchanged, but it has two “names”: the coordinates $\langle 1, 2\rangle$ and the coordinates $\langle \tfrac{5}{7}, \tfrac{3}{7}\rangle$. This is called the alias interpretation.

On the other hand, the second picture suggests that the vector $\langle 1, 2\rangle$ has been changed. Originally, it was $\langle 1, 2\rangle$, but now it is changed to $\langle \tfrac{5}{7}, \tfrac{3}{7}\rangle$. This is called the alibi interpretation.

Matrix Representations

Every vector in an n-dimensional vector space has a coordinate vector representation with respect to a given basis. The same is true of linear transformations. For instance, if $L : U \to V$ is a linear transformation and if $\mathcal{B}$ and $\mathcal{C}$ are bases for U and V respectively, then there is an m × n matrix A, where dim(U) = n and dim(V) = m, that represents L in the sense that if $[u]_{\mathcal{B}} = \langle s_1, s_2, \ldots, s_n\rangle$ is the coordinate vector of u with respect to $\mathcal{B}$ and $[L(u)]_{\mathcal{C}}$ is the coordinate vector of L(u) with respect to $\mathcal{C}$, then

\[
[L(u)]_{\mathcal{C}} = A \cdot [u]_{\mathcal{B}}.
\]

The jth column of A is the sequence of coefficients of the vector $L(\mathbf{b}_j)$ in terms of the basis $\mathcal{C}$. We can illustrate this in the figure below.

\[
\begin{array}{ccc}
u & \xrightarrow{\;L\;} & L(u)\\
\big\downarrow & & \big\downarrow\\
{[u]_{\mathcal{B}}} & \xrightarrow{\;A\;} & {[L(u)]_{\mathcal{C}}}
\end{array}
\]

This diagram tells us that if we take a vector u ∈ U, find its coordinate vector $[u]_{\mathcal{B}}$, and multiply by the matrix representation A, we will get the same result as if we had first applied L to u and then found the coordinate vector $[L(u)]_{\mathcal{C}}$. In other words, $L : U \to V$ can be represented by $A : K^n \to K^m$.

In this subsection, we will limit the discussion to linear transformations from a vector space U to itself. In this context, we will investigate what happens to a matrix representation when we change the basis. In Activities 3 and 4, you considered the process by which one changes from the coordinate basis to a basis $\mathcal{B}$ for a linear transformation between spaces of tuples.

Let’s review the methodology suggested in these activities by considering an example. Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation given by the formula

\[
L(\langle x_1, x_2\rangle) = \langle 3x_1 + 2x_2,\; x_1 + 2x_2\rangle .
\]

If we work with the coordinate basis $\mathcal{C} = [\mathbf{e}_1, \mathbf{e}_2]$, the matrix representation of L with respect to $\mathcal{C}$ is given by

\[
C = \begin{pmatrix} 3 & 2\\ 1 & 2 \end{pmatrix}.
\]

The matrix representation of L with respect to $\mathcal{B} = [\langle 1, -1\rangle, \langle 2, 1\rangle]$ is given by

\[
B = \begin{pmatrix} 1 & 0\\ 0 & 4 \end{pmatrix}.
\]

Later in this section and throughout the remainder of this chapter, we will discover that diagonal forms are extremely important and useful. The func ChangeFromCoordB that you constructed in Activity 4 changes a representation written in terms of the coordinate basis into a representation with respect to a basis $\mathcal{B}$. If you applied ChangeFromCoordB to the matrix C and the basis $\mathcal{B}$ given here, would ISETL return the matrix B? Are the component pieces of ChangeFromCoordB consistent with the theorem given below?


Theorem 7.1.2. Let L : Rn −→ Rn be a linear transformation whose matrixwith respect to the coordinate basis is denoted by C. Let B = [b1, . . . ,bn] bean ordered basis for Rn, and let M be the matrix whose jth column is thesequence of coefficients of bj. Then, the matrix B of L with respect to B isgiven by

B = M−1CM.

Proof. The jth column of B consists of the sequence of coefficients of L(bj)in terms of B. We must show that the jth column of M−1CM consists ofthe same sequence of coefficients. Since C is the matrix representation of Lwith respect to C, the jth column of CM is the coordinate vector [L(bj)]C.By Theorem 7.1.1, the product M−1 · [L(bj)]C is [L(bj)]B.
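Theorem 7.1.2 can be checked against the example given before it, where $C = \begin{pmatrix} 3 & 2\\ 1 & 2\end{pmatrix}$ and $\mathcal{B} = [\langle 1,-1\rangle, \langle 2,1\rangle]$. The sketch below is in Python, not ISETL; `change_from_coord` is a hypothetical analogue of the func ChangeFromCoordB, restricted to the 2 × 2 case so that the inverse can be written explicitly.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inv2(m):
    """Inverse of a 2 x 2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def change_from_coord(C, basis):
    """Return M^{-1} C M, where the columns of M are the basis vectors."""
    M = [[basis[j][i] for j in range(2)] for i in range(2)]
    return matmul(inv2(M), matmul(C, M))


C = [[3, 2], [1, 2]]          # matrix of L in the coordinate basis
basis = [(1, -1), (2, 1)]     # the basis B from the example
B = change_from_coord(C, basis)
print(B)
```

The result is the diagonal matrix with entries 1 and 4, in agreement with the matrix B stated in the example.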

The func ChangeToCoordB that you constructed in Activity 5 reverses the process given by Theorem 7.1.2. If we start with the matrix representation in terms of a non-coordinate basis, how do we get to the matrix representation for the coordinate basis? Specifically, how do we represent the matrix representation C in terms of B? Before proceeding further, let’s summarize the relationship between the coordinate basis and a second basis $\mathcal{B}$.

• If $L : \mathbb{R}^n \to \mathbb{R}^n$ is a linear transformation, then the jth column of the matrix representation of L with respect to the coordinate basis $\mathcal{C}$ is the coordinate vector $[L(\mathbf{e}_j)]_{\mathcal{C}}$.

• If we want to find the matrix of L with respect to $\mathcal{B}$, we first construct a transition matrix M. The jth column of this matrix consists of the coefficients of the vector $\mathbf{b}_j$.

• If we let C denote the coordinate basis representation of L, then the matrix representation with respect to $\mathcal{B}$ is found by computing the product $M^{-1}CM$. The jth column of the product is the coordinate vector $[L(\mathbf{b}_j)]_{\mathcal{B}}$.

• If we are given the matrix representation B with respect to $\mathcal{B}$ and wish to find the coordinate basis representation, we compute the product $MBM^{-1}$. The jth column of $MBM^{-1}$ is the coordinate vector $[L(\mathbf{e}_j)]_{\mathcal{C}}$.

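If $\mathcal{B}_1$ and $\mathcal{B}_2$ are two bases, neither of which is the coordinate basis, how do we use Theorem 7.1.2 to find the transition from $\mathcal{B}_1$ to $\mathcal{B}_2$, and vice versa? The diagram below illustrates the relationships involved in making these transitions. In the figure, v is a vector in $K^n$, $B_1$ is the matrix representation of L in terms of $\mathcal{B}_1$, and $B_2$ denotes the matrix representation of L with respect to $\mathcal{B}_2$.

\[
\begin{array}{ccc}
{[v]_{\mathcal{B}_1}} & \xrightarrow{\;B_1\;} & {[L(v)]_{\mathcal{B}_1}}\\
\big\uparrow {\scriptstyle M_{\mathcal{B}_1}^{-1}} & & \big\uparrow {\scriptstyle M_{\mathcal{B}_1}^{-1}}\\
v & \xrightarrow{\;L\;} & L(v)\\
\big\downarrow {\scriptstyle M_{\mathcal{B}_2}^{-1}} & & \big\downarrow {\scriptstyle M_{\mathcal{B}_2}^{-1}}\\
{[v]_{\mathcal{B}_2}} & \xrightarrow{\;B_2\;} & {[L(v)]_{\mathcal{B}_2}}
\end{array}
\]

What are the entries of the transition matrices $M_{\mathcal{B}_1}$ and $M_{\mathcal{B}_2}$? Using the diagram, we can see that

\begin{align*}
[L(v)]_{\mathcal{B}_2} &= B_2 M_{\mathcal{B}_2}^{-1} M_{\mathcal{B}_1} [v]_{\mathcal{B}_1}\\
[L(v)]_{\mathcal{B}_2} &= M_{\mathcal{B}_2}^{-1} M_{\mathcal{B}_1} B_1 [v]_{\mathcal{B}_1}.
\end{align*}

Therefore,

\[
B_2 = M_{\mathcal{B}_2}^{-1} M_{\mathcal{B}_1} B_1 M_{\mathcal{B}_1}^{-1} M_{\mathcal{B}_2}.
\]

Following a similar argument, we can write $B_1$ in terms of $B_2$. How do we interpret the equation given here? To get from $B_1$ to $B_2$, one would start with $B_1$ and convert to the coordinate basis. This is represented by $M_{\mathcal{B}_1} B_1 M_{\mathcal{B}_1}^{-1}$. This is followed by a transition from the coordinate basis to $\mathcal{B}_2$, which is represented by multiplying by the inverse of $M_{\mathcal{B}_2}$ on the left and $M_{\mathcal{B}_2}$ on the right. How would we construct a func using ChangeToCoordB and ChangeFromCoordB to make the transition from $B_1$ to $B_2$, and vice versa?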
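One possible shape for such a construction is sketched below in Python (not ISETL), for the 2 × 2 case only. The functions `change_to_coord`, `change_from_coord`, and `change_basis` are hypothetical analogues of the funcs from Activities 4 and 5; here both take the basis as a second argument. The transformation is the example L from earlier in this subsection, and the second basis $[\langle 1,0\rangle, \langle 1,1\rangle]$ is chosen only for illustration.

```python
from fractions import Fraction


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inv2(m):
    """Inverse of a 2 x 2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def columns(basis):
    return [[basis[j][i] for j in range(2)] for i in range(2)]


def change_to_coord(B, basis):
    """M B M^{-1}: from a representation in `basis` to the coordinate basis."""
    M = columns(basis)
    return matmul(M, matmul(B, inv2(M)))


def change_from_coord(C, basis):
    """M^{-1} C M: from the coordinate basis to a representation in `basis`."""
    M = columns(basis)
    return matmul(inv2(M), matmul(C, M))


def change_basis(B1, basis1, basis2):
    """Pass through the coordinate basis on the way from basis1 to basis2."""
    return change_from_coord(change_to_coord(B1, basis1), basis2)


B1 = [[1, 0], [0, 4]]                      # L with respect to basis1
basis1 = [(1, -1), (2, 1)]
basis2 = [(1, 0), (1, 1)]                  # illustrative second basis
B2 = change_basis(B1, basis1, basis2)
print(change_to_coord(B1, basis1), B2)
```

The intermediate matrix is the coordinate representation C of L, and the columns of the final matrix are the coordinates of $L(\langle 1,0\rangle) = \langle 3,1\rangle$ and $L(\langle 1,1\rangle) = \langle 5,3\rangle$ with respect to the second basis.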

Matrices with Special Forms

As we have seen, the process of making a transition from one basis to another is quite involved. How does this process help us in working with linear transformations? In this subsection and throughout the remainder of this chapter, we will consider examples that will help to show the importance of the ability to change bases.


Triangular matrices. Recall that if you have a system of m linear equations in n unknowns, then you can interpret it as an equation L(x) = c, where $L : \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, c is a vector in $\mathbb{R}^m$, and $x \in \mathbb{R}^n$. If $A = (a_{ij})$ is the matrix of L with respect to the coordinate bases and $x = \langle x_1, \ldots, x_n\rangle$, $c = \langle c_1, \ldots, c_m\rangle$ are the representations of x, c in terms of their coefficients with respect to these bases, then the system of equations can be represented as the matrix equation Ax = c.

Now, suppose that the matrix A is in lower triangular form, that is, $a_{ij} = 0$ for i < j. Then you can write the solution very quickly. The first equation involves only $x_1$, so you can solve it (provided $a_{11} \neq 0$). The second equation involves only $x_1, x_2$. Since you already know the solution of $x_1$, you can solve for $x_2$. Following a similar approach, we can find the solution of each $x_i$.
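The procedure just described is usually called forward substitution. A minimal Python sketch follows (not ISETL; the name `forward_substitute` is our own). It assumes a square lower triangular matrix with nonzero diagonal entries, and it uses the matrix and the right-hand side $c_1 = 6, c_2 = 2, c_3 = 1$ that appear in the example discussed next.

```python
from fractions import Fraction


def forward_substitute(A, c):
    """Solve A x = c for a square lower triangular A with nonzero diagonal."""
    x = []
    for i in range(len(A)):
        # Contribution of the unknowns that have already been found.
        known = sum(A[i][j] * x[j] for j in range(i))
        x.append((Fraction(c[i]) - known) / A[i][i])
    return x


A = [[3, 0, 0],
     [1, 2, 0],
     [7, -4, 3]]
c = [6, 2, 1]
x = forward_substitute(A, c)
print(x)
```

Each unknown is found from one equation using only the unknowns before it, so no elimination step is needed.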

For example, the answer to Activity 6(b) is the following lower triangular matrix:

\[
\begin{pmatrix} 3 & 0 & 0\\ 1 & 2 & 0\\ 7 & -4 & 3 \end{pmatrix}.
\]

You might not have found this matrix, but now that you know it, can you find the basis that gives it? The system of equations that gives this matrix is

\begin{align*}
3x_1 &= c_1\\
x_1 + 2x_2 &= c_2\\
7x_1 - 4x_2 + 3x_3 &= c_3,
\end{align*}

where $c_1, c_2, c_3$ are given numbers. You can write the solution almost immediately as

\begin{align*}
x_1 &= \frac{c_1}{3}\\
x_2 &= \frac{1}{2}(c_2 - x_1) = \frac{3c_2 - c_1}{6}\\
x_3 &= \frac{1}{3}(c_3 - 7x_1 + 4x_2) = \frac{c_3 - 3c_1 + 2c_2}{3}.
\end{align*}

Of course, this is not the solution of the system of equations whose matrix of coefficients is the original matrix $B_2$ given in Activity 6(b). Actually, the $x_1, x_2, x_3$ given here are the coefficients of that solution with respect to the basis that transformed $B_2$ into the above triangular matrix. In fact, that basis is $\mathcal{B} = [\langle 1, 1, 2\rangle, \langle 1, 2, 3\rangle, \langle 1, 2, 4\rangle]$. Given this information and assuming that the right hand side of this system was given by $c_1 = 6$, $c_2 = 2$, $c_3 = 1$, can you find the solution of the original system? (Don’t forget the values of $c_1, c_2, c_3$ are coefficients of a vector with respect to the basis $\mathcal{B}$.)
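The claim that this basis transforms $B_2$ into the triangular matrix can be checked without computing an inverse. If M has the basis vectors as its columns and T denotes the triangular matrix, then $M^{-1}B_2M = T$ is equivalent to $B_2M = MT$ because M is invertible. The Python sketch below (not ISETL; the names are our own) verifies the second form. It does not solve the system posed in the question above.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


B2 = [[19, -18, 5],
      [32, -33, 10],
      [62, -69, 22]]

basis = [(1, 1, 2), (1, 2, 3), (1, 2, 4)]
# Transition matrix: the jth column is the jth basis vector.
M = [[basis[j][i] for j in range(3)] for i in range(3)]

T = [[3, 0, 0],
     [1, 2, 0],
     [7, -4, 3]]

left, right = matmul(B2, M), matmul(M, T)
print(left == right)
```

Both products have columns $\langle 11, 19, 37\rangle$, $\langle -2, -4, -10\rangle$, and $\langle 3, 6, 12\rangle$, so the two sides agree.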

You will note that we are not saying much about how to find a basis that transforms a matrix into triangular form. One reason for this is that it is not very practical. If we want to solve a system of equations, there are better methods, such as Gaussian elimination or inversion of the matrix of coefficients (for which there are efficient computer methods). A more interesting comment is a theoretical one. Suppose you have a linear transformation $L : V \to V$ and a basis $\mathcal{B}$ for V such that the matrix of L with respect to $\mathcal{B}$ is upper triangular. Then $L(\mathbf{b}_1)$ is an element of the subspace of V generated by $\mathbf{b}_1$. Also, $L(\mathbf{b}_2)$ is an element of the subspace of V generated by $\{\mathbf{b}_1, \mathbf{b}_2\}$. This means that every element of the subspace of V generated by $\{\mathbf{b}_1, \mathbf{b}_2\}$ is mapped by L into that same subspace generated by $\{\mathbf{b}_1, \mathbf{b}_2\}$. In other words, the subspace of V generated by $\{\mathbf{b}_1, \mathbf{b}_2\}$ is invariant under the linear transformation.

We can continue this process and say, for each k (up to the dimension of V), that the subspace generated by $\{\mathbf{b}_1, \mathbf{b}_2, \ldots, \mathbf{b}_k\}$ is invariant under L. Such a decomposition of V into an increasing sequence of subspaces, each invariant under L, has theoretical importance in the general theory of vector spaces.

Diagonal matrices. A linear transformation $T : \mathbb{R}^n \to \mathbb{R}^n$ whose matrix representation with respect to the coordinate basis is diagonal has a particularly simple structure. It takes each vector $\mathbf{e}_i$ and multiplies it by a fixed scalar. For instance, if $x = \langle 2, 1, 3\rangle$, and if $T : \mathbb{R}^3 \to \mathbb{R}^3$ is a linear transformation whose matrix representation with respect to the coordinate basis is

\[
\begin{pmatrix} -3 & 0 & 0\\ 0 & 4 & 0\\ 0 & 0 & 5 \end{pmatrix},
\]

then $T(x) = \langle -3 \cdot 2,\; 4 \cdot 1,\; 5 \cdot 3\rangle$. In this case, the x component is scaled by a factor of −3, the y component is scaled by a factor of 4, and the z component is scaled by a factor of 5. This means that the unit cube maps to the box B, as illustrated in the figure below.

In $\mathbb{R}^3$, provide a picture of the unit cube and the resulting “box” found by applying T to the unit cube.

Figure 7.4: Losing a dimension

If the matrix representation of T with respect to the coordinate basis is not diagonal, then such a simple description of T is generally not possible. Under certain circumstances, however, we can find a basis for which the matrix representation is diagonal and the notion of scaling makes sense. Consider the following example $T : \mathbb{R}^2 \to \mathbb{R}^2$ defined by

\[
T(\langle x_1, x_2\rangle) = \begin{pmatrix} 5 & 4\\ 4 & -1 \end{pmatrix} \begin{pmatrix} x_1\\ x_2 \end{pmatrix}.
\]

As you can see, the matrix representation of T with respect to the coordinate basis is not diagonal. However, in this case, we can find a basis for which the matrix representation of T is diagonal, namely

\[
\mathcal{B} = \left\{ \left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}} \right\rangle,\; \left\langle -\tfrac{1}{\sqrt{5}}, \tfrac{2}{\sqrt{5}} \right\rangle \right\}.
\]

The matrix of T with respect to $\mathcal{B}$ is given by

\[
\begin{pmatrix} 7 & 0\\ 0 & -3 \end{pmatrix}.
\]
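A matrix representation is diagonal with respect to a basis exactly when T sends each basis vector to a scalar multiple of itself. The Python sketch below (not ISETL; the names are our own) checks this for the two basis vectors above with the scalars 7 and −3. Because the basis involves $\sqrt{5}$, the comparison is done in floating point with a small tolerance rather than exactly.

```python
import math

A = [[5, 4],
     [4, -1]]

s = math.sqrt(5)
basis = [(2 / s, 1 / s), (-1 / s, 2 / s)]
scalars = [7, -3]   # diagonal entries claimed for the matrix of T


def apply(m, v):
    """Multiply the 2 x 2 matrix m by the vector v."""
    return tuple(sum(m[i][k] * v[k] for k in range(2)) for i in range(2))


# For each basis vector b, test whether A b equals (scalar) * b.
checks = [
    all(math.isclose(a, lam * b, abs_tol=1e-12)
        for a, b in zip(apply(A, v), v))
    for v, lam in zip(basis, scalars)
]
print(checks)
```

Both checks succeed, which is consistent with the diagonal matrix stated above.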

The vectors $\left\langle \tfrac{2}{\sqrt{5}}, \tfrac{1}{\sqrt{5}} \right\rangle$ and $\left\langle -\tfrac{1}{\sqrt{5}}, \tfrac{2}{\sqrt{5}} \right\rangle$ define a new coordinate system.

In $\mathbb{R}^2$, picture of coordinate axes with vectors from $\mathcal{B}$ superimposed. Picture of unit square ABCO with respect to “new” coordinate system and picture of image rectangle A′B′C′O. O here is the origin.

Figure 7.5: Effect of a diagonal basis

T is a scaling with respect to $\mathcal{B}$, with a factor of 7 in the x′ direction and −3 in the y′ direction. As you can see, the square ABCO is mapped by T to the rectangle A′B′C′O.


The ability to diagonalize can also be used to simplify the equation of aconic section in which there is a “middle” term, say

3x2 + 2xy + 3y2 = 8.

The presence of the xy term makes this difficult to graph. In order to usediagonalization, we need to make use of an auxiliary concept called innerproduct. This is a deep and important concept in mathematics, but, fornow, you can consider the following formula as shorthand notation. If x =〈x1, . . . , xn〉 ,y = 〈y1, . . . , yn〉 ∈ Rn, then we can define the inner product〈x,y〉 by

〈x,y〉 =n∑i=1

xiyi.

We can write the algebraic expression 3x2 + 2xy + 3y2 as an inner product

〈A 〈x1, x2〉 , 〈x1, x2〉〉 ,

where

A =

(3 11 3

),

and 〈〉 represents the dot product on R2. Next, we change to the basis

B =

[⟨1√2,

1√2

⟩,

⟨1√2,− 1√

2

⟩]. The matrix B of the transformation rep-

resented by A is

B =

(4 00 2

).

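If we let $M$ be the matrix whose columns are the basis vectors $\mathbf{b}_1$ and $\mathbf{b}_2$ of $B$, set $\langle y_1, y_2 \rangle = M^{-1} \langle x_1, x_2 \rangle$, and make the substitution $\langle x_1, x_2 \rangle = M \langle y_1, y_2 \rangle$ in the original equation, then, after some simplification, we get

$$4y_1^2 + 2y_2^2 = 8.$$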
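The simplification can be sketched as follows. Vectors are written as columns, so that $\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}^{T}\mathbf{v}$; the symbol $D$ is introduced only for this sketch, and the step relies on the fact that the columns of $M$ are orthonormal, so $M^{-1} = M^{T}$.

```latex
\begin{aligned}
M &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},
  \qquad M^{T} M = I, \\[4pt]
\langle A\mathbf{x}, \mathbf{x} \rangle
  &= \langle A M \mathbf{y}, M \mathbf{y} \rangle
   = \mathbf{y}^{T} \left( M^{T} A M \right) \mathbf{y}, \\[4pt]
M^{T} A M
  &= \frac{1}{2}
     \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
     \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}
     \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
   = \begin{pmatrix} 4 & 0 \\ 0 & 2 \end{pmatrix} = D, \\[4pt]
\langle A\mathbf{x}, \mathbf{x} \rangle
  &= \mathbf{y}^{T} D \mathbf{y} = 4y_1^{2} + 2y_2^{2}.
\end{aligned}
```

Setting the last expression equal to $8$ gives the equation without the middle term.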

Picture of ellipse with both sets of coordinate axes shown.

Figure 7.6: Rotation of coordinate axes

As Figure 7.6 shows, the original equation $3x^2 + 2xy + 3y^2 = 8$ can be represented by the equation $4(x')^2 + 2(y')^2 = 8$ in the $x'y'$ coordinate system, where the $x'$-axis is the line that coincides with the basis vector $\left\langle \tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}} \right\rangle$, and the $y'$-axis is the line that coincides with the basis vector $\left\langle \tfrac{1}{\sqrt{2}}, -\tfrac{1}{\sqrt{2}} \right\rangle$.

Exercises

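1. Let $V = \mathbb{R}^3$. Find the coordinate vector of $\langle -2, 1, 3 \rangle$ in terms of the basis $[\langle 1, 1, 1 \rangle, \langle 1, 1, 0 \rangle, \langle 1, 0, 0 \rangle]$.

2. Let $V = \mathbb{R}^2$. Find the coordinate vector of $\langle 6, 5 \rangle$ with respect to the basis $[\langle 1, -1 \rangle, \langle 2, 1 \rangle]$.

3. Let $V = \mathbb{R}^4$. Find the coordinate vector of $\langle 6, -3, 1, 2 \rangle$ with respect to the basis $[\langle 1, -1, 0, 2 \rangle, \langle 2, 1, -1, 3 \rangle, \langle 3, 0, 1, 0 \rangle, \langle 1, 0, 0, 4 \rangle]$.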

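4. Complete (a)–(b) for the bases

$B_1 = [\langle 1, 2 \rangle, \langle 3, 0 \rangle]$ and $B_2 = [\langle 2, 1 \rangle, \langle 3, 2 \rangle]$.

(a) If $[\mathbf{x}]_{B_1} = \langle 4, -3 \rangle$, find $[\mathbf{x}]_{B_2}$.

(b) Find the form of $\mathbf{x}$ in terms of the coordinate basis.

5. Complete (a)–(b) for the bases

$B_1 = [\langle 3, -1 \rangle, \langle 1, 1 \rangle]$ and $B_2 = [\langle 2, 3 \rangle, \langle 4, 5 \rangle]$.

(a) If $[\mathbf{x}]_{B_1} = \langle -2, 5 \rangle$, find $[\mathbf{x}]_{B_2}$.

(b) Find the form of $\mathbf{x}$ in terms of the coordinate basis.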

6. Let $V = P_3$. Find the coordinate vector of $p = 3 - 2x + x^3$ with respect to the basis $[1, x^2 - 2, x^2 - x + 1, x^3 + x]$.

7. Let $V = P_2$. Find the coordinate vector of $p = 3x^2 - 6x - 2$ with respect to the basis $[x - 1, 2x + 3, x^2 + x]$.

8. Let $P_2(\mathbb{R})$ be the vector space of polynomials of degree two or less with real coefficients. A basis $C = [\mathbf{c}_0, \mathbf{c}_1, \mathbf{c}_2]$ is given by

$$\mathbf{c}_0 = 1, \qquad \mathbf{c}_1 = x, \qquad \mathbf{c}_2 = x^2.$$

Another basis $B = [\mathbf{b}_0, \mathbf{b}_1, \mathbf{b}_2]$ is given by

$$\mathbf{b}_0 = \tfrac{1}{2}x(x - 1), \qquad \mathbf{b}_1 = -x^2 + 1, \qquad \mathbf{b}_2 = \tfrac{1}{2}x(x + 1).$$

For each of the polynomials given below, write its coordinates with respect to the basis $C$, and then change to its coordinates with respect to the basis $B$. You can solve the problem by hand, or use the computer tools you built in the activities.

(a) $p(x) = 3x^2 - 5x + 2$

(b) $q(x) = 7x^2 - 4$

(c) $r(x) = 2x - 1$

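9. Let $P_2(\mathbb{R})$ be the vector space of polynomials of degree two or less with real coefficients. A basis $C = [\mathbf{c}_0, \mathbf{c}_1, \mathbf{c}_2]$ is given by

$$\mathbf{c}_0 = 1, \qquad \mathbf{c}_1 = x, \qquad \mathbf{c}_2 = x^2.$$

Another basis $B = [\mathbf{b}_0, \mathbf{b}_1, \mathbf{b}_2]$ is given by

$$\mathbf{b}_0 = \tfrac{1}{2}x(x - 1), \qquad \mathbf{b}_1 = -x^2 + 1, \qquad \mathbf{b}_2 = \tfrac{1}{2}x(x + 1).$$

For each of the following polynomials, find its coordinates with respect to the basis $B$. You can solve the problem by hand, or use the computer tools you built in the activities.

(a) $p = 4\mathbf{c}_0 - 3\mathbf{c}_1 - 6\mathbf{c}_2$

(b) $q = 6\mathbf{c}_0 - 4\mathbf{c}_2$

(c) $r = 3\mathbf{c}_1$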

10. Justify each step of the calculation given in Theorem 7.1.1.

11. Formulate a theorem, similar to Theorem 7.1.1, that gives a formula for changing a coordinate vector with respect to a basis $B_1$ to its corresponding coordinate vector with respect to a basis $B_2$, where neither $B_1$ nor $B_2$ is the coordinate basis. Once you have written a statement of this theorem, provide a proof.

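12. In each of the following problems, a linear transformation $L : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is given by its matrix $A$ with respect to the given basis $B$. Find its matrix with respect to the coordinate basis. You can solve the problem by hand, or use the computer tools you built in the activities.

(a)

$$A = \begin{pmatrix} 3 & -1 & 4 \\ -1 & 3 & 0 \\ 0 & 2 & 3 \end{pmatrix}$$

$B = [\langle -2, 2, -2 \rangle, \langle 3, -2, -3 \rangle, \langle 2, -1, -1 \rangle]$

(b)

$$A = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix}$$

$B = [\langle 1, 1, 1, 0 \rangle, \langle 1, 1, 0, 1 \rangle, \langle 1, 0, 1, 1 \rangle, \langle 0, 1, 1, 1 \rangle]$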

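13. In each of the following problems, a linear transformation $L : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is given by its matrix $A$ with respect to the given basis $B_1$. Find its matrix with respect to the basis $B_2$. You can solve the problem by hand, or use the computer tools you built in the activities.

(a)

$$A = \begin{pmatrix} 3 & -1 & 4 \\ -1 & 3 & 0 \\ 0 & 2 & 3 \end{pmatrix}$$

$B_1 = [\langle -2, 2, -2 \rangle, \langle 3, -2, -3 \rangle, \langle 2, -1, -1 \rangle]$

$B_2 = [\langle 2, 1, 0 \rangle, \langle 1, 0, 2 \rangle, \langle 1, 2, 1 \rangle]$

(b)

$$A = \begin{pmatrix} 1/2 & 1/8 & 1/4 & 1/8 \\ 0 & 1/2 & 0 & 1/2 \\ 1/4 & 1/2 & 0 & 1/4 \\ 0 & 1/3 & 0 & 2/3 \end{pmatrix}$$

$B_1 = [\langle 1, 1, 1, 0 \rangle, \langle 1, 1, 0, 1 \rangle, \langle 1, 0, 1, 1 \rangle, \langle 0, 1, 1, 1 \rangle]$

$B_2 = [\langle 1, -1, 0, 2 \rangle, \langle 2, 1, -2, 0 \rangle, \langle 1, 0, -2, 2 \rangle, \langle 0, 2, 1, -1 \rangle]$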

14. Define a transformation F : R3 −→ R3 by

F (〈x1, x2, x3〉) = 〈x1 − x2 − x3, 0, 2x1 + 3x3,−x2 + x3〉 .

Find the matrix representation of F with respect to the basis

d = [〈1, 2, 1〉 , 〈2,−1, 1〉 , 〈3, 1, 2〉].

15. Suppose that $T : \mathbb{R}^3 \longrightarrow \mathbb{R}^3$ is a linear transformation whose matrix representation with respect to some basis $B_1$ is given by

$$\begin{pmatrix} 3 & -1 & 2 \\ 4 & -5 & 6 \\ -1 & 3 & -7 \end{pmatrix}.$$

Suppose that the transition matrix from $B_1$ to another basis $B_2$ is given by

$$\begin{pmatrix} 2 & 1 & 1 \\ 1 & 0 & 1 \\ 0 & 2 & 3 \end{pmatrix}.$$

Find the expression for the rule of correspondence of $T$ in terms of the coordinate basis.

16. Let $P_3(\mathbb{R})$ be the vector space of polynomials of degree three or less with real coefficients. A basis $C = [\mathbf{c}_0, \mathbf{c}_1, \mathbf{c}_2, \mathbf{c}_3]$ is given by

$$\mathbf{c}_0 = 1, \qquad \mathbf{c}_1 = x, \qquad \mathbf{c}_2 = x^2, \qquad \mathbf{c}_3 = x^3.$$

This is called the basis of monomials.

Another basis $B = [\mathbf{b}_0, \mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3]$ is given by

$$\mathbf{b}_0 = 1, \qquad \mathbf{b}_1 = x, \qquad \mathbf{b}_2 = x(x - 1), \qquad \mathbf{b}_3 = x(x - 1)(x - 2).$$

This will be called the basis of linear products.

Let $L : P_3(\mathbb{R}) \longrightarrow P_3(\mathbb{R})$ be the linear transformation, called the difference operator, given by

$$L(p)(x) = p(x + 1) - p(x).$$

Write the matrix of $L$ with respect to the basis $C$, and then find its matrix with respect to the basis $B$. You can solve the problem by hand, or use the computer tools you built in the activities.

17. Define $L : P_3(\mathbb{R}) \longrightarrow P_3(\mathbb{R})$ to be the linear transformation defined by the derivative, that is, $L(p) = p'$. Write the matrix of $L$ with respect to the monomial basis (see previous exercise), and then find its matrix with respect to the basis of linear products (see previous exercise). You can solve the problem by hand, or use the computer tools you built in the activities. In this exercise, we think of each polynomial as an expression for a function from $\mathbb{R}$ to $\mathbb{R}$.

18. Why do you think that we have been using sequences of basis vectors rather than sets of basis vectors? If the order of the vectors in a basis is changed, would the coordinate vector change? To help you in answering this question, construct a basis in $\mathbb{R}^3$. Select a vector $\mathbf{x}$. Find the coordinate vector of $\mathbf{x}$ with respect to the basis you have constructed. Change the order of the basis elements, and find the coordinate vector with respect to this new ordering. What do you observe?

19. Let $F : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be a linear transformation. Formulate a theorem, similar to Theorem 7.1.2, that gives a formula for changing the matrix representation of $F$ with respect to a basis $B_1$ to its corresponding matrix representation with respect to a basis $B_2$, where neither $B_1$ nor $B_2$ is the coordinate basis. Once you have written a statement of this theorem, provide a proof.


7.2 Eigenvalues and Eigenvectors

Activities

1. In each of the following problems, (a)–(e), find as many solutions to the given equation as you can. Let $\mathbf{x} \in \mathbb{R}^2$.

(a) For the matrix $A_1 = \begin{pmatrix} 3 & 1 \\ 2 & 2 \end{pmatrix}$, find all solutions to $A_1\mathbf{x} = \mathbf{x}$.

(b) For the matrix $A_2 = \begin{pmatrix} 5 & -2 \\ 7 & -3 \end{pmatrix}$

(c) For the matrix $A_3 = \begin{pmatrix} 2 & -1 \\ 2 & 2 \end{pmatrix}$

(d) Let $\lambda$, the Greek letter lambda, represent a scalar. For the matrix $A_4 = \begin{pmatrix} 1 & 3 \\ 2 & 2 \end{pmatrix}$, find all $\lambda$ such that the equation $A_4\mathbf{x} = \lambda\mathbf{x}$ has at least one solution. For each such $\lambda$, find all solutions $\mathbf{x}$ that satisfy the given equation.

(e) Let $\lambda$, the Greek letter lambda, represent a scalar. For the matrix $A_5 = \begin{pmatrix} 5 & -2 \\ 2 & 1 \end{pmatrix}$, find all $\lambda$ such that the equation $A_5\mathbf{x} = \lambda\mathbf{x}$ has at least one solution. For each such $\lambda$, find all solutions $\mathbf{x}$ that satisfy the given equation.

2. Complete (a)–(d) for the matrix $A_4$ defined in Activity 1(d).

(a) Write the polynomial $p$ given by $p(\lambda) = \det(A_4 - \lambda I)$, where $I$ is the $2 \times 2$ identity matrix.

(b) Find all $\lambda$ such that $p(\lambda) = 0$. For each solution, select a nonzero vector $\mathbf{x}$ such that $A_4\mathbf{x} = \lambda\mathbf{x}$.

(c) What can you say about the vectors you picked in the previous step? Do they form a basis for $\mathbb{R}^2$?

(d) Think of $A_4$ as representing the expression of a linear transformation $T : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ given by

$$T(\mathbf{x}) = A_4\mathbf{x}.$$

Use the information you have gathered thus far to find a diagonal matrix representation for $T$.

3. Repeat Activity 2 for the matrix given by

$$A_6 = \begin{pmatrix} 0 & 3 & -3 \\ 2 & 2 & -2 \\ -4 & -1 & 1 \end{pmatrix}.$$

4. Repeat Activity 2 for the matrix given by

$$A_7 = \begin{pmatrix} 0 & 0 & 0 \\ -4 & 1 & -1 \\ 3 & 2 & -1 \end{pmatrix}.$$

Did anything different happen this time? Can you diagonalize $A_7$? Try to explain as much as you can.

5. Let $S$ be the matrix whose columns are the vectors you found in Activity 3. Set $D = S^{-1}A_6S$, where $A_6$ denotes the matrix from Activity 3. Do you see anything remarkable about $D$? If so, can you explain it?

Discussion

Basic Ideas

Given a linear transformation $L : V \longrightarrow V$, the point of Activity 1 was to illustrate the idea that it is possible to find scalars $\lambda$ and nonzero vectors $\mathbf{x}$ such that

$$L(\mathbf{x}) = \lambda\mathbf{x}.$$

When this occurs, we say that $\lambda$ is an eigenvalue of $L$ and $\mathbf{x}$ is an eigenvector belonging to $\lambda$. Formally, we have

Definition 7.2.1. Let $L : V \longrightarrow V$ be a linear transformation. If there exists a nonzero vector $\mathbf{x} \in V$ for which $L(\mathbf{x}) = \lambda\mathbf{x}$, then we say that $\lambda$ is an eigenvalue of $L$. Any nonzero vector $\mathbf{x}$ satisfying the equality for a particular $\lambda$ is called an eigenvector belonging to $\lambda$.

What examples of eigenvalues and eigenvectors did you find in the activities? Carefully describe them before proceeding.

What effect does a linear transformation have upon an eigenvector? We know that a linear transformation takes a vector and transforms it into another vector. If $L : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is a linear transformation, and if $\mathbf{x} \in \mathbb{R}^n$ is an eigenvector belonging to $\lambda$, how might we describe the way $L$ transforms $\mathbf{x}$? According to the definition, the effect of $L$ on an eigenvector, no matter how complex the transformation, is nothing more than a simple scaling of the eigenvector. In this context, what does the term “scaling” mean?

In $\mathbb{R}^3$, give an arrow for a vector $\mathbf{x}$. Give a second, longer arrow that coincides with $\mathbf{x}$ that will be labeled $L(\mathbf{x})$.

Figure 7.7: Image of an eigenvector
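The scaling behavior can also be seen in a small computation. The sketch below, written in Python rather than ISETL, uses the matrix $A_4$ from Activity 1(d); the helper `apply` and the particular vectors tried are assumptions made for illustration. The first two vectors are scaled by the transformation, while the third is not mapped to a multiple of itself.

```python
def apply(A, x):
    """Return the matrix-vector product A x (A is a list of rows)."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A4 = [[1, 3],
      [2, 2]]  # matrix A4 from Activity 1(d)

# Candidate pairs (scalar, vector), worked out by hand for this sketch.
candidates = [(4, [1, 1]), (-1, [3, -2])]

for lam, x in candidates:
    image = apply(A4, x)
    scaled = [lam * xi for xi in x]
    # For an eigenvector, the image equals the scaled copy of x.
    print(lam, x, image, image == scaled)

# A vector that is not an eigenvector: its image is not a multiple of it.
print(apply(A4, [1, 0]))
```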

The activities introduced a methodology for finding eigenvalues and eigenvectors. The procedure outlined in Activity 2 can be expanded and generalized. In order to identify the eigenvalues of a linear transformation $L : K^n \longrightarrow K^n$, we define a second transformation based upon the equation $L(\mathbf{x}) = \lambda\mathbf{x}$. Define $I : K^n \longrightarrow K^n$ to be the identity transformation: $I(\mathbf{x}) = \mathbf{x}$ for all $\mathbf{x} \in K^n$. Define $M : K^n \longrightarrow K^n$ by the expression

$$M(\mathbf{x}) = L(\mathbf{x}) - \lambda I(\mathbf{x}).$$

Is $M$ a linear transformation? Before continuing, you should check this. What is the relationship between the expression for $M$ and the equation $L(\mathbf{x}) = \lambda\mathbf{x}$? A particular scalar, say $s$, is an eigenvalue of $L$ if and only if, taking $\lambda = s$ in the definition of $M$, there exists a nonzero vector $\mathbf{x}_s \in K^n$ such that $\mathbf{x}_s \in \ker(M)$. Can you explain why this is the case?

If the kernel of $M$ were to contain the zero vector exclusively, then $\dim(\ker(M)) = 0$. According to Theorem 5.2.5, the Rank and Nullity Theorem, $\operatorname{rank}(M) = n$. As a result, the rank of any matrix representation of $M$ would be $n$. In such a case, Theorems 6.3.4 and 6.4.4 tell us that $\det(M) \neq 0$. Hence, a scalar $s$ is an eigenvalue if and only if

$$|[L] - sI| = 0,$$

where $[L]$ denotes a matrix representation of $L$. The set of eigenvalues of $L$ is subsequently given by the set

$$\{\lambda : |[L] - \lambda I| = 0\}.$$

The determinant $|[L] - \lambda I|$, a polynomial in $\lambda$, is called the characteristic polynomial of $L$. Any root $\lambda$ of this polynomial is an eigenvalue of $L$. It is useful to summarize all of this in a theorem.


Theorem 7.2.1. Let $L : K^n \longrightarrow K^n$ be a linear transformation, $[L]$ a matrix representation of $L$, and let $p$ be the function given by $p(\lambda) = \det([L] - \lambda I)$. Then, $p$ is a polynomial of degree $n$ in $\lambda$ with coefficients in $K$. The eigenvalues of $L$ are precisely the roots of $p$.
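For a $2 \times 2$ matrix the theorem can be tried out directly. The sketch below is in Python rather than ISETL; the function names `char_poly_2x2` and `real_roots` are hypothetical helpers, and the code relies on the expansion $\det(A - \lambda I) = \lambda^2 - (a + d)\lambda + (ad - bc)$ for a matrix with rows $(a, b)$ and $(c, d)$. It is applied to $A_4$ from Activity 1(d) and to a rotation matrix, which has no real eigenvalues.

```python
import math


def char_poly_2x2(A):
    """Coefficients (c2, c1, c0) of det(A - t I) = c2 t^2 + c1 t + c0."""
    (a, b), (c, d) = A
    return (1, -(a + d), a * d - b * c)


def real_roots(coeffs):
    """Distinct real roots of a quadratic, in increasing order."""
    c2, c1, c0 = coeffs
    disc = c1 * c1 - 4 * c2 * c0
    if disc < 0:
        return []  # no real roots, hence no real eigenvalues
    r = math.sqrt(disc)
    return sorted({(-c1 - r) / (2 * c2), (-c1 + r) / (2 * c2)})


A4 = [[1, 3], [2, 2]]         # from Activity 1(d)
rotation = [[0, -1], [1, 0]]  # rotation of the plane by 90 degrees

print(char_poly_2x2(A4), real_roots(char_poly_2x2(A4)))
print(char_poly_2x2(rotation), real_roots(char_poly_2x2(rotation)))
```

Solving the quadratic in closed form only works for $n = 2$; for larger matrices the roots generally have to be found by factoring or by numerical methods.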

As you may recall from your study of algebra, a polynomial of degree $n$ has at most $n$ roots or zeros. This means that a linear transformation from $\mathbb{R}^n$ into itself has at most $n$ eigenvalues. If we are working over the complex numbers $\mathbb{C}$, we know that a polynomial can be factored completely. Hence, if $L$ is a linear transformation from $\mathbb{C}^n$ to itself, and if we count eigenvalues according to their multiplicity, then $L$ would have exactly $n$ eigenvalues.

Bases of Eigenvectors

The steps in Activity 2 provided a rough sketch of the procedure for finding a diagonal matrix representation. In Activity 5, you found a diagonal form by multiplying by a suitable transition matrix. These activities raise some important questions. Is a set of eigenvectors always linearly independent? Do diagonal matrix representations correspond to bases consisting exclusively of eigenvectors? What happened in Activity 4 that prevented the transformation defined by $A_7$ from having a diagonal representation? Providing answers to these questions will be one focus of our work in this and the next section. We provide a partial answer to the first question in the following theorem.

Theorem 7.2.2. Let $L : V \longrightarrow V$ be a linear transformation. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be a set of distinct eigenvalues of $L$. For each $i = 1, 2, \ldots, k$, let $\mathbf{v}_i$ be an eigenvector belonging to $\lambda_i$. Then, the set of vectors

$$\{\mathbf{v}_i : i = 1, 2, \ldots, k\}$$

is linearly independent.

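Proof. The proof will be by induction on $k$. Since an eigenvector, by definition, is not zero, the theorem is true for $k = 1$. Suppose that the theorem holds for any set of $k$ eigenvectors,

$$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k,$$

belonging to a set

$$\lambda_1, \lambda_2, \ldots, \lambda_k$$

of $k$ distinct eigenvalues. Now, suppose we have a set of $k + 1$ distinct eigenvalues,

$$\lambda_1, \lambda_2, \ldots, \lambda_k, \lambda_{k+1},$$

and a corresponding set of eigenvectors,

$$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k, \mathbf{v}_{k+1},$$

where $\mathbf{v}_i$ belongs to $\lambda_i$. By the induction hypothesis, the first $k$ of these are linearly independent. In order to show that the entire set is independent, all we have to show is that $\mathbf{v}_{k+1}$ is not a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$. Suppose this is not the case; that is, $\mathbf{v}_{k+1} = \sum_{i=1}^{k} t_i\mathbf{v}_i$, where the $t_i$ are scalars. Then we would have

$$\sum_{i=1}^{k} t_i\lambda_{k+1}\mathbf{v}_i = \lambda_{k+1}\sum_{i=1}^{k} t_i\mathbf{v}_i = \lambda_{k+1}\mathbf{v}_{k+1} = L(\mathbf{v}_{k+1}) = L\left(\sum_{i=1}^{k} t_i\mathbf{v}_i\right) = \sum_{i=1}^{k} t_iL(\mathbf{v}_i) = \sum_{i=1}^{k} t_i\lambda_i\mathbf{v}_i.$$

But then we would have

$$\mathbf{0} = \sum_{i=1}^{k} t_i\lambda_{k+1}\mathbf{v}_i - \sum_{i=1}^{k} t_i\lambda_i\mathbf{v}_i = \sum_{i=1}^{k} t_i(\lambda_{k+1} - \lambda_i)\mathbf{v}_i.$$

Since the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ are linearly independent, each coefficient $t_i(\lambda_{k+1} - \lambda_i)$ is zero. Since all of the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_{k+1}$ are distinct, $\lambda_{k+1} - \lambda_i \neq 0$, $i = 1, 2, \ldots, k$, and so $t_i = 0$, $i = 1, 2, \ldots, k$. This implies that $\mathbf{v}_{k+1} = \mathbf{0}$, which is not the case.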

This theorem explains why the matrix $S$ in Activity 5 is invertible. The columns of $S$ were the eigenvectors belonging to different eigenvalues. Indeed, if a linear transformation on a vector space $V$ of dimension $n$ has $n$ distinct eigenvalues, then a set of $n$ eigenvectors, each corresponding to a distinct eigenvalue, constitutes a basis. This situation can be generalized. Specifically, we can construct a basis of eigenvectors in which some belong to the same eigenvalue. The next theorem will help us to prove such a theorem.
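As an illustration, the following Python sketch (not ISETL) builds one possible matrix $S$ for $A_6$ and computes $S^{-1}A_6S$ with exact rational arithmetic. The eigenvalues $0$, $6$, $-3$ and the eigenvectors used as columns were worked out by hand for this sketch and are only one valid choice; `matmul` and `inverse` are hypothetical helpers.

```python
from fractions import Fraction


def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def inverse(X):
    """Gauss-Jordan inverse over the rationals; ValueError if singular."""
    n = len(X)
    # Augment X with the identity matrix.
    aug = [[Fraction(v) for v in row] +
           [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(X)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


A6 = [[0, 3, -3],
      [2, 2, -2],
      [-4, -1, 1]]

# Columns: eigenvectors <0,1,1>, <1,1,-1>, <1,0,1> for 0, 6, -3 (by hand).
S = [[0, 1, 1],
     [1, 1, 0],
     [1, -1, 1]]

D = matmul(inverse(S), matmul(A6, S))
for row in D:
    print([int(v) for v in row])
```

Because the three eigenvalues are distinct, Theorem 7.2.2 guarantees that the columns of $S$ are independent, so `inverse` does not raise an error, and the product comes out diagonal with the eigenvalues on the diagonal.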

Theorem 7.2.3. The set of all eigenvectors belonging to the same eigenvalue, together with the zero vector, forms a subspace.



We will use the theorem just stated, along with Theorem 7.2.2, to prove a stronger version of Theorem 7.2.2. We will show that it is possible to construct a linearly independent set of eigenvectors in which some of the vectors belong to the same eigenvalue. This theorem is an important step in the process of determining conditions that guarantee diagonalizability, that is, the existence of a basis for which a linear transformation has a diagonal matrix representation.

Theorem 7.2.4. Let $L : V \longrightarrow V$ be a linear transformation, and let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be a set of distinct eigenvalues of $L$. Let $E$ be a set of eigenvectors that satisfies the following condition: for each $i = 1, 2, \ldots, k$, those eigenvectors that belong to $\lambda_i$ are linearly independent. It then follows that the entire set $E$ is linearly independent.

Proof. Suppose that

$$E = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}, \quad m \leq n.$$

Let $a_1, a_2, \ldots, a_m$ be scalars such that

$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_m\mathbf{v}_m = \mathbf{0}.$$

It suffices to show that

$$a_1 = a_2 = \cdots = a_m = 0.$$

Group the terms of the combination according to the eigenvalue to which each vector belongs. By Theorem 7.2.3, the sum of the terms in each group is either the zero vector or another eigenvector belonging to the same eigenvalue. If we do this for every such grouping, the original linear combination

$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_m\mathbf{v}_m$$

simplifies to a sum of vectors, each belonging to a different eigenvalue. By Theorem 7.2.2, each of these vectors must be zero. This means that each part of the original combination that belonged to a particular eigenvalue yields a vector sum of zero. Since we are assuming that those vectors in $E$ that belong to the same eigenvalue are linearly independent, it follows that the associated scalars are zero. Since this happens for each such grouping of the original linear combination above, it follows that the coefficients $a_1, a_2, \ldots, a_m$ are all simultaneously zero.


The next two definitions will help us to state a condition that guarantees the existence of a basis of eigenvectors. The proof will be based upon Theorems 7.2.2, 7.2.3, and 7.2.4. This theorem will be a key tool in helping us to devise a procedure for diagonalization. This procedure will be discussed in detail in the next section.

Definition 7.2.2. The algebraic multiplicity of an eigenvalue of a linear transformation $L : K^n \longrightarrow K^n$ is its multiplicity as a root of the characteristic polynomial of $L$.

Definition 7.2.3. The geometric multiplicity of an eigenvalue $\lambda$ is the dimension of the subspace of eigenvectors that belong to it.

Theorem 7.2.5. If $L : K^n \longrightarrow K^n$ is a linear transformation, and if the sum of the geometric multiplicities of the eigenvalues of $L$ is $n$, then there is a basis of eigenvectors of $L$.

Proof. For each eigenvalue, find a basis for the subspace of its eigenvectors (it is a subspace by Theorem 7.2.3). Then, by Theorem 7.2.4, the union of all of these bases is linearly independent. By the assumption regarding geometric multiplicities, the number of vectors in this linearly independent set is $n$. By Theorem 4.4.8, this set is a basis.

Theorem 7.2.5, which gives a condition for diagonalizability, allows us to expand the level of detail of the procedure given in Activity 2. In each step, assume that $L : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ is a linear transformation.

1. Find the matrix representation $[L]$ of $L$ with respect to the coordinate basis. Determine the characteristic polynomial $|[L] - \lambda I|$.

2. Find the roots of the characteristic polynomial.

3. Find a basis for the subspace corresponding to each eigenvalue.

4. Take the union of these bases.

5. If the sum of the geometric multiplicities of the eigenvalues of $L$ is equal to $n$, then the linearly independent set described in step 4 forms a basis. Can you explain why? What happens if the sum of the geometric multiplicities is not equal to $n$? Would $L$ be diagonalizable in this case?


This procedure leaves several unanswered questions: Can you find the roots of the characteristic equation? How do we construct a basis for a given eigenspace (the subspace of eigenvectors corresponding to a particular eigenvalue)? What happens if the sum of the geometric multiplicities is not equal to the dimension of the vector space? What is the precise relationship between an “eigenbasis”, if one can be constructed, and the coordinate basis? These and related questions will be answered in the next section.
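The test in step 5 of the procedure can be sketched in code once the eigenvalues are known. The Python fragment below (not ISETL) computes a geometric multiplicity as $n - \operatorname{rank}([L] - \lambda I)$, which follows from the Rank and Nullity Theorem; the helper names `rank` and `geometric_multiplicity` are hypothetical, and the eigenvalues of $A_4$ and $A_5$ from Activity 1 are supplied by hand rather than computed.

```python
from fractions import Fraction


def rank(X):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in X]
    r = 0  # number of pivot rows found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows))
                      if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            f = rows[i][col] / rows[r][col]
            rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def geometric_multiplicity(A, lam):
    """dim ker(A - lam I), computed as n - rank(A - lam I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


A4 = [[1, 3], [2, 2]]   # eigenvalues 4 and -1 (supplied by hand)
A5 = [[5, -2], [2, 1]]  # single eigenvalue 3, a double root (by hand)

total_A4 = sum(geometric_multiplicity(A4, lam) for lam in (4, -1))
total_A5 = geometric_multiplicity(A5, 3)
print(total_A4, total_A5)
```

For $A_4$ the geometric multiplicities add up to $2$, the dimension of the space, while for $A_5$ the total is only $1$, so the union of eigenspace bases in step 4 falls short of a basis.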

What Can Happen?

Before considering conditions guaranteeing diagonalizability, as well as other applications of the theory of eigenvalues and eigenvectors, it may be useful to summarize all of the various possibilities encountered thus far. If $L : K^n \longrightarrow K^n$ is a linear transformation, where $K$ is some field, the following list details important facts concerning eigenvalues and eigenvectors.

• If $K$ is the set $\mathbb{C}$ of complex numbers, then it is certain that $L$ will have at least one eigenvalue. In general, however, it is possible that $L$ has no eigenvalues. When might $L$ not have any eigenvalues?

• $L$ has at most $n$ eigenvalues. Can you explain why?

• If there is a basis for $K^n$ consisting of eigenvectors, then the matrix of $L$ with respect to that basis is diagonal. We actually showed that it might be possible to construct a basis consisting exclusively of eigenvectors. However, we have not yet proven that a basis consisting of eigenvectors yields a diagonal matrix representation. Can you prove that?

• If $L$ has $n$ distinct eigenvalues, then there is a basis for $K^n$ consisting of eigenvectors.

• If the sum of the geometric multiplicities of the eigenvalues of $L$ is $n$, then there is a basis for $K^n$ consisting of eigenvectors.

In considering these possibilities, we must be careful to take into account the base field. For example, the characteristic polynomial $p$ of the matrix $A_7$ in Activity 4 is given by

$$p(\lambda) = -\lambda(1 + \lambda^2).$$

One might be tempted to conclude that $0$ is the only eigenvalue. If $K = \mathbb{R}$, as was the case in Activity 4, then $0$ is the only eigenvalue, and a transformation $T : \mathbb{R}^3 \longrightarrow \mathbb{R}^3$ defined by $T(\mathbf{x}) = A_7\mathbf{x}$ cannot be diagonalized. On the other hand, if we take $K$ to be the complex numbers, then a linear transformation $T : \mathbb{C}^3 \longrightarrow \mathbb{C}^3$ defined by $A_7$ has $3$ distinct eigenvalues. In this case, T can be diagonalized.
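The dependence on the base field can be seen by evaluating the characteristic polynomial at complex arguments. The sketch below is in Python rather than ISETL and uses Python's built-in complex numbers; `det3` and `p` are hypothetical helpers, and the candidate roots $0$, $i$, and $-i$ are read off from the factored form $-\lambda(1 + \lambda^2)$.

```python
def det3(X):
    """Determinant of a 3x3 matrix, expanding along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = X
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A7 = [[0, 0, 0],
      [-4, 1, -1],
      [3, 2, -1]]  # matrix A7 from Activity 4


def p(lam):
    """Evaluate the characteristic polynomial det(A7 - lam I) at lam."""
    shifted = [[A7[r][c] - (lam if r == c else 0) for c in range(3)]
               for r in range(3)]
    return det3(shifted)


# 0 is a real root; 1j and -1j are roots only when K is the complex numbers.
for lam in (0, 1j, -1j):
    print(lam, abs(p(lam)))

# A value that is not a root, compared against the factored form.
print(p(2), -2 * (1 + 2 ** 2))
```

Only the first of the three roots is a real number, which is why the same matrix has one eigenvalue over $\mathbb{R}$ and three over $\mathbb{C}$.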

Exercises

In doing the following problems, you may use any computer software, including any constructions you made in the activities, or a computer algebra system such as Derive, Maple, Matlab, or Mathematica.

1. Consider each of the following matrices as representing a linear transformation on a vector space whose field of scalars is $\mathbb{R}$. For each matrix, determine the characteristic polynomial, the eigenvalues, and the corresponding eigenspaces.

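(a) $A_1 = \begin{pmatrix} -5 & 4 \\ -8 & 7 \end{pmatrix}$

(b) $A_2 = \begin{pmatrix} 2 & 1 \\ -3 & -1 \end{pmatrix}$

(c) $A_3 = \begin{pmatrix} -7 & -6 \\ 15 & 12 \end{pmatrix}$

(d) $A_4 = \begin{pmatrix} 4 & 0 & 2 \\ 2 & 3 & 2 \\ -3 & 0 & -1 \end{pmatrix}$

(e) $A_5 = \begin{pmatrix} 4 & 1 & 2 \\ 0 & 3 & -2 \\ 0 & 0 & -1 \end{pmatrix}$

(f) $A_6 = \begin{pmatrix} 9 & -7 & 7 \\ 3 & -1 & 3 \\ -5 & 5 & -3 \end{pmatrix}$

(g) $A_7 = \begin{pmatrix} 3 & 1 & 0 \\ 0 & 1 & 0 \\ 4 & 2 & 1 \end{pmatrix}$

(h) $A_8 = \begin{pmatrix} 2 & 3 & 6 \\ 6 & 2 & -3 \\ 3 & -6 & 2 \end{pmatrix}$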

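2. Repeat Exercise 1 for the following matrices, except this time assume that the field of scalars is the complex numbers $\mathbb{C}$.

(a) $A_1 = \begin{pmatrix} 1 & -3 \\ 1 & 1 \end{pmatrix}$

(b) $A_2 = \begin{pmatrix} 2 & 3 & 6 \\ 6 & 2 & -3 \\ 3 & -6 & 2 \end{pmatrix}$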

3. Suppose $F : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ is a linear transformation whose matrix representation with respect to the ordered basis $B = [\langle 1, 1 \rangle, \langle 2, -1 \rangle]$ is

$$\begin{pmatrix} 5 & 0 \\ 0 & -1 \end{pmatrix}.$$

(a) Show that $B$ is a basis of eigenvectors.

(b) Find the matrix representation of $F$ with respect to the coordinate basis.

(c) Find the characteristic polynomial of $F$.

4. Suppose that $p(x) = (x - 3)^2(x - 2)(x + 1)$ is the characteristic polynomial of a linear transformation $T : \mathbb{R}^4 \longrightarrow \mathbb{R}^4$. Is $T$ diagonalizable? Why, or why not?


5. Suppose that $p(\lambda) = \lambda^2 - 5\lambda + 6$ is the characteristic polynomial of a linear transformation $G : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$.

(a) Construct a basis for $\mathbb{R}^2$ of eigenvectors of $G$.

(b) Find the matrix representation of $G$ with respect to the eigenbasis you constructed in (a).

(c) Find the matrix representation of $G$ with respect to the ordered basis $B = [\langle 1, 3 \rangle, \langle -4, 2 \rangle]$.

6. Let $L : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be a linear transformation. Let $I : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be the identity transformation. Show that the transformation $M : \mathbb{R}^n \longrightarrow \mathbb{R}^n$, defined by

$$M(\mathbf{x}) = L(\mathbf{x}) - \lambda I(\mathbf{x}),$$

is a linear transformation.

7. Let $L$, $I$, and $M$ be defined as they were in Exercise 6. Show that $\lambda$ is an eigenvalue of $L$ if and only if there exists a nonzero vector $\mathbf{x}$ such that $\mathbf{x} \in \ker(M)$.

8. Let $L$, $I$, and $M$ be defined as they have been in the previous two exercises. Carefully explain why the set of eigenvalues of $L$ is the solution set of the equation

$$\det([M]) = 0,$$

where $[M]$ denotes the matrix representation of $M$ with respect to the coordinate basis.

9. Explain why the matrix $S$ defined in Activity 5 is invertible.


11. In Theorem 7.2.4, the original linear combination

$$a_1\mathbf{v}_1 + a_2\mathbf{v}_2 + \cdots + a_m\mathbf{v}_m$$

simplifies to a sum of distinct eigenvectors, each belonging to a different eigenvalue. If you recall, we were trying to show that the scalars $a_i$ are simultaneously zero. Hence, we were considering the “simplified sum” when set equal to the zero vector. In this context, use Theorem 7.2.2 to explain why each of the vectors in the simplified combination must be the zero vector.

12. Let $A$ be a matrix representation of a linear transformation on a vector space over $K$ with respect to some basis $B$. If $A$ is diagonal, prove that the basis $B$ consists entirely of eigenvectors, and show that the diagonal elements of $A$ are the eigenvalues.

13. Let

$$A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$$

be the matrix representation of a linear transformation $T : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ with respect to the coordinate basis.

(a) Find the characteristic polynomial of $T$.

(b) Show that $T$ has no eigenvalues.

(c) Interpret this result geometrically. In particular, what does it mean to say that $T$ has no eigenvalues?

(d) If the vector space were $\mathbb{C}^2$ instead of $\mathbb{R}^2$, would $T$ still have no eigenvalues? Explain.

14. Consider the linear transformation $T : \mathbb{R}^3 \longrightarrow \mathbb{R}^3$ whose matrix with respect to the coordinate basis is

$$\begin{pmatrix} 3 & 1 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{pmatrix}.$$

Try to diagonalize this matrix. What is the relationship between the algebraic and geometric multiplicities of an eigenvalue in this case?

15. If $A$ is a square matrix, then $A^2$ denotes $A \cdot A$. Similarly, $A^3 = A \cdot A \cdot A$, and $A^n = A \cdot A \cdots A$, a product of $n$ copies of $A$. If a square matrix $A$ is diagonalizable, and if all of its eigenvalues are either $1$ or $-1$, then show that $A^2 = I$.

16. If $A$ is a square, diagonalizable matrix, and if all of its eigenvalues are either $1$ or $0$, then prove $A^2 = A$.

17. If $A$ is a square, diagonalizable matrix, and if all of its eigenvalues are either $3$ or $-5$, then show that $A^2 + 2A - 15I = 0$, where $0$ denotes the zero matrix.


18. Can you think of a general statement for which the three previous exercises are special cases?

19. Show that the general formula for the characteristic polynomial of a $2 \times 2$ matrix

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$$

is given by $\lambda^2 - (\operatorname{tr}(A))\lambda + \det(A)$, where $\operatorname{tr}(A)$ denotes the trace of $A$, the sum of the diagonal entries.

20. Suppose $\mathbf{v}$ is a nonzero eigenvector of a matrix $A$ belonging to the eigenvalue $\lambda$. Show that $\mathbf{v}$ is an eigenvector of $A^n$ belonging to $\lambda^n$.


7.3 Diagonalization and Applications

Activities

1. Let $T : \mathbb{R}^2 \longrightarrow \mathbb{R}^2$ be defined by

$$T(\langle x_1, x_2 \rangle) = A \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix},$$

where

$$A = \begin{pmatrix} 7 & -10 \\ 3 & -4 \end{pmatrix}.$$

(a) Verify that the matrix representation of $T$ with respect to the coordinate basis is given by $A$. Find the characteristic polynomial of $T$ with respect to the coordinate basis, that is, compute $|A - \lambda I_2|$.

(b) Let $B = \{\langle 2, 1 \rangle, \langle -3, 4 \rangle\}$ be another basis for $\mathbb{R}^2$. Apply the func ChangeMatRep that you constructed in Activity 4 of Section 7.2 to find $[T]_B$, the matrix representation of $T$ with respect to the basis $B$. Find the characteristic polynomial of $T$ with respect to the basis $B$, that is, compute $|[T]_B - \lambda I_2|$.

(c) Use Theorem 7.1.2 to show that $A$ and $[T]_B$ are similar matrices. What is the relationship between $|A - \lambda I_2|$ and $|[T]_B - \lambda I_2|$?

(d) Based upon the results you obtained in part (c), what can we say about the characteristic polynomials of two similar matrices? If asked to find the eigenvalues of a linear transformation, does it appear to matter what basis we work with? Explain your answer.

2. Define $F : \mathbb{R}^3 \longrightarrow \mathbb{R}^3$ by

$$F(\langle x_1, x_2, x_3 \rangle) = \langle 4x_1 + x_3,\ 2x_1 + 3x_2 + 2x_3,\ x_1 + 4x_3 \rangle.$$

(a) Find the eigenvalues of $F$. Does the characteristic polynomial factor completely?

(b) Suppose $\lambda = a$ is an eigenvalue of $F$. According to Theorem 7.2.3, the eigenspace corresponding to $a$ forms a subspace of $\mathbb{R}^3$. A vector $\mathbf{x} \in \mathbb{R}^3$ is an eigenvector corresponding to $a$ if and only if $\mathbf{x}$ is a nonzero solution of $([F]_C - aI_3)\mathbf{x} = \mathbf{0}$, where $C$ denotes the coordinate basis. Why is this the case? How can we use this equation to find a basis for the eigenspace corresponding to $a$? Once you have answered these questions, find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).

(c) Let $E$ be the collection of all of the eigenbasis vectors you found in part (b). According to Theorem 7.2.4, this set is linearly independent. Does this set form a basis for $\mathbb{R}^3$ in this case? Why, or why not?

(d) If $E$ forms a basis for $\mathbb{R}^3$, what is $[F]_E$, the matrix representation with respect to $E$? What is the form of the transition matrix from the coordinate basis $C$ to the basis $E$? If $E$ does not form a basis, would it still be possible to construct a basis of eigenvectors?

(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add to the dimension of $\mathbb{R}^3$?

3. Define $G : \mathbb{R}^3 \longrightarrow \mathbb{R}^3$ by

$$G(\langle x_1, x_2, x_3 \rangle) = \langle x_3,\ x_1 - x_3,\ x_2 + x_3 \rangle.$$

(a) Find the eigenvalues of $G$. Does the characteristic polynomial factor completely?

(b) Find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).

(c) Let $E$ be the collection of all of the eigenbasis vectors you found in part (b). Does this set form a basis for $\mathbb{R}^3$ in this case? Why, or why not?

(d) If $E$ forms a basis for $\mathbb{R}^3$, what is $[G]_E$, the matrix representation with respect to $E$? What is the form of the transition matrix from the coordinate basis $C$ to the basis $E$? If $E$ does not form a basis, would it still be possible to construct a basis of eigenvectors?

(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add to the dimension of $\mathbb{R}^3$?


4. Let $P_2(\mathbb{R})$ be the space of all polynomials with real coefficients of degree 2 or less. Define $H : P_2(\mathbb{R}) \longrightarrow P_2(\mathbb{R})$ by

$$H(p) = p',$$

where $p \in P_2(\mathbb{R})$, and $p'$ denotes the derivative of $p$. (Note that here we are considering polynomials as functions, rather than as a sequence of coefficients from a field. Do you see this distinction?)

(a) Find the matrix representation of $H$ with respect to the basis $B = \{1, x, x^2\}$. Find the eigenvalues of $H$. Does the characteristic polynomial factor completely?

(b) Find a basis for the eigenspace corresponding to each of the eigenvalues you found in part (a).

(c) Let $E$ be the collection of all of the eigenbasis vectors you found in part (b). Does this set form a basis for $P_2(\mathbb{R})$ in this case? Why, or why not?

(d) If $E$ forms a basis for $P_2(\mathbb{R})$, what is $[H]_E$, the matrix representation with respect to $E$? What is the form of the transition matrix from the basis $B$ to the basis $E$? If $E$ does not form a basis, would it still be possible to construct a basis of eigenvectors?

(e) What is the relationship between the algebraic multiplicity of each eigenvalue and its geometric multiplicity? Do the geometric multiplicities add to the dimension of $P_2(\mathbb{R})$?

(f) On the basis of your findings here and in Activities 2 and 3, under what condition is a linear transformation likely to have a diagonal matrix representation?

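5. Let

$$A = \begin{pmatrix} -4 & -6 \\ 3 & 5 \end{pmatrix}.$$

Given that $A = C^{-1}DC$, where

$$C = \begin{pmatrix} -1 & -1 \\ 1 & 2 \end{pmatrix} \quad \text{and} \quad D = \begin{pmatrix} -1 & 0 \\ 0 & 2 \end{pmatrix},$$

determine whether $A^n = C^{-1}D^nC$ for $n = 2, 3, 4$. What do you observe? Based upon your observations, state a conjecture, if possible.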


Discussion

Relationship between Diagonalizability and Eigenvalues

Toward the end of the last section, we outlined a procedure for diagonalizing a matrix. In this section, we will provide additional detail and determine conditions which ensure diagonalizability. But, before we continue, it might be helpful to define exactly what we mean by diagonalizability.

Definition 7.3.1. Let $T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be a linear transformation. $T$ is diagonalizable if there exists a basis $B$ such that the corresponding matrix representation $[T]_B$ is a diagonal matrix.

Activities 2, 3, and 4, together with the discussion in the last section, suggest that the diagonalizability of a linear transformation depends upon the ability to find, or construct, a basis consisting exclusively of eigenvectors. The theorem below verifies that this is indeed the case.

Theorem 7.3.1. Let $T : \mathbb{R}^n \longrightarrow \mathbb{R}^n$ be a linear transformation, and let $B$ be a basis. The matrix representation $[T]_B$ is diagonal if and only if $B$ consists exclusively of eigenvectors.

Proof. ($\Longleftarrow$:) Assume that $B$ is a basis of eigenvectors. We want to show that $[T]_B$ is a diagonal matrix. The proof of this part is left to the exercises. See Exercise 6.

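($\Longrightarrow$:) Assume that $[T]_B$ is a diagonal matrix. Then, there exist scalars $a_{11}, a_{22}, \ldots, a_{nn}$ such that

$$[T]_B = \begin{pmatrix} a_{11} & 0 & 0 & \cdots & 0 \\ 0 & a_{22} & 0 & \cdots & 0 \\ 0 & 0 & a_{33} & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{nn} \end{pmatrix}.$$

Let the basis $B$ be given by $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$. Then, according to Definition 6.2.2,

$$T(\mathbf{v}_i) = 0\mathbf{v}_1 + \cdots + 0\mathbf{v}_{i-1} + a_{ii}\mathbf{v}_i + 0\mathbf{v}_{i+1} + \cdots + 0\mathbf{v}_n = a_{ii}\mathbf{v}_i.$$

According to Definition 7.2.1, $\mathbf{v}_i$, $i = 1, 2, \ldots, n$, is an eigenvector. Therefore, $B$ is a basis consisting entirely of eigenvectors.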

In Section 7.3, you found eigenvalues by working with the coordinate ba-sis. The results of Activity 1 suggest that the characteristic polynomial doesnot depend upon the specific choice of basis. The next theorem establishesthis as a general result.

Theorem 7.3.2. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $B_1$ and $B_2$ be two bases for $\mathbb{R}^n$. Then,

$$|[T]_{B_1} - \lambda I_n| = |[T]_{B_2} - \lambda I_n|,$$

that is, the characteristic polynomial is independent of the choice of basis.

Proof. As given in the statement of the theorem, let $[T]_{B_1}$ and $[T]_{B_2}$ be two matrix representations of $T$ with respect to the bases $B_1$ and $B_2$, respectively. According to Theorem 7.1.2, these two matrices are similar, that is, there exists an invertible matrix $C$ such that

$$[T]_{B_2} = C^{-1}[T]_{B_1}C.$$

What are the entries of $C$? Can you recall based upon the theorem just cited?

Using $C$, we can establish the following equality,

$$\begin{aligned}
|[T]_{B_2} - \lambda I_n| &= |C^{-1}[T]_{B_1}C - \lambda I_n| \\
&= |C^{-1}([T]_{B_1} - \lambda I_n)C| \\
&= |C^{-1}|\,|[T]_{B_1} - \lambda I_n|\,|C| \\
&= |[T]_{B_1} - \lambda I_n|\,|C^{-1}C| \\
&= |[T]_{B_1} - \lambda I_n|,
\end{aligned}$$

which is what we wished to prove. Can you justify each step?
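The invariance asserted by Theorem 7.3.2 can be checked on a concrete matrix. The sketch below is written in Python rather than ISETL, purely for illustration. It uses the matrix of the transformation diagonalized later in this section, together with an invertible matrix `C` chosen arbitrarily for this example (an assumption, not taken from the text). The helper names `matmul` and `char_poly` are hypothetical. The coefficients are computed with the Faddeev-LeVerrier recurrence for $\det(tI - A)$, which differs from $|A - tI|$ only by the factor $(-1)^n$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(tI - A) (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(x) for x in row] for row in A]
    M = [[Fraction(0)] * n for _ in range(n)]   # M_0 = 0
    coeffs = [Fraction(1)]                      # leading coefficient c_n = 1
    for k in range(1, n + 1):
        AM = matmul(A, M)
        # M_k = A*M_{k-1} + c_{n-k+1} * I
        M = [[AM[i][j] + (coeffs[-1] if i == j else 0) for j in range(n)]
             for i in range(n)]
        AMk = matmul(A, M)
        # c_{n-k} = -trace(A*M_k) / k
        coeffs.append(-sum(AMk[i][i] for i in range(n)) / k)
    return coeffs

A = [[15, 7, -7], [-1, 1, 1], [13, 7, -5]]
C = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]          # arbitrary invertible matrix (assumption)
C_inv = [[1, -1, 1], [0, 1, -1], [0, 0, 1]]    # its inverse, checked on the next lines
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert matmul(C, C_inv) == identity

B = matmul(C_inv, matmul(A, C))                # a matrix similar to A
print(char_poly(A) == char_poly(B))
```

The two matrices `A` and `B` have different entries, yet the comparison of their coefficient lists is what the theorem predicts should succeed.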

These theorems simplify the basic procedure for diagonalizing a transformation. The theorem we have just proven tells us that we can use any basis, and hence, any matrix representation, to find the eigenvalues of a linear transformation. Theorem 7.3.1 reveals that diagonalizability depends entirely upon the ability to construct a basis of eigenvectors. What remains is to find conditions that guarantee the existence of an eigenbasis.


Conditions that Guarantee Diagonalizability

In Activities 2, 3, and 4, you were asked to compare the geometric and algebraic multiplicities of each eigenvalue, as well as to determine whether the characteristic polynomial splits, that is, factors completely. Based upon your results, is it possible for a diagonalizable transformation to have a characteristic polynomial that does not split? If the characteristic polynomial splits, can you immediately conclude that the transformation is diagonalizable? Does the relationship between the algebraic and geometric multiplicities of each eigenvalue appear to have any bearing upon the issue of diagonalizability? The next theorem provides an answer to the first question.

Theorem 7.3.3. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that is diagonalizable. Then, the characteristic polynomial of $T$ splits.

Proof. According to the assumption, there exists a basis $B$ such that the resulting matrix representation $[T]_B$ is a diagonal matrix. By Theorem 7.3.2, the choice of basis does not affect the form of the characteristic polynomial. Hence, we can find the characteristic polynomial of $T$ by evaluating the determinant $|[T]_B - \lambda I_n|$. Since the only nonzero entries of $[T]_B - \lambda I_n$ lie along the diagonal and are of the form $a_{ii} - \lambda$, $i = 1, 2, \ldots, n$, it follows that the characteristic polynomial $|[T]_B - \lambda I_n|$ is a product of $n$ factors of the form $(a_{ii} - \lambda)$, $i = 1, 2, \ldots, n$.

Does this theorem answer the second question posed in the first paragraph of this subsection? Why, or why not?
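Read in the contrapositive, Theorem 7.3.3 says that a transformation whose characteristic polynomial fails to split cannot be diagonalizable. As a rough illustration, written in Python rather than ISETL, the sketch below tests a $2 \times 2$ real matrix by looking at the discriminant of its characteristic polynomial $t^2 - \operatorname{tr}(A)\,t + \det(A)$. The function name `splits_over_reals` and the rotation matrix are assumptions made for this example; the second matrix is the migration matrix from the Markov chain example later in this section.

```python
def splits_over_reals(A):
    """True if the characteristic polynomial of a 2x2 real matrix has only real roots."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    # t^2 - trace*t + det factors over R exactly when the discriminant is nonnegative.
    return trace * trace - 4 * det >= 0

rotation = [[0, -1], [1, 0]]          # rotation of the plane by 90 degrees
migration = [[0.8, 0.1], [0.2, 0.9]]  # matrix from the Markov chain example

print(splits_over_reals(rotation))    # characteristic polynomial is t^2 + 1
print(splits_over_reals(migration))
```

A negative discriminant for the rotation means that it has no real eigenvalues at all, so no basis of eigenvectors of $\mathbb{R}^2$ can exist.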

Activities 2, 3, and 4 reveal a second consequence of diagonalizability, the equality of the algebraic and geometric multiplicities of each eigenvalue. Before we can prove this result, we first show that the geometric multiplicity of an eigenvalue cannot exceed its algebraic multiplicity.

Theorem 7.3.4. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. Let $\lambda$ be an eigenvalue of $T$. Then, the geometric multiplicity of $\lambda$ does not exceed its algebraic multiplicity.

Proof. Let $\lambda$ be an eigenvalue of $T$ having algebraic multiplicity $m$. Certainly, $m \le n$. Why is this true? Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_p\}$ be a basis for the eigenspace corresponding to $\lambda$. Then, $p \le n$. Why? According to Theorem 4.4.10, we can expand this linearly independent set to a basis for all of $\mathbb{R}^n$, say

$$\{\mathbf{v}_1, \ldots, \mathbf{v}_p, \mathbf{v}_{p+1}, \ldots, \mathbf{v}_n\}.$$


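Since the first $p$ vectors are eigenvectors, the matrix representation of $T$ with respect to this basis is of the form

$$\begin{pmatrix}
\lambda & 0 & 0 & \cdots & 0 & a_{1,p+1} & \cdots & a_{1n} \\
0 & \lambda & 0 & \cdots & 0 & a_{2,p+1} & \cdots & a_{2n} \\
0 & 0 & \lambda & \cdots & 0 & a_{3,p+1} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & \lambda & a_{p,p+1} & \cdots & a_{pn} \\
0 & 0 & 0 & \cdots & 0 & a_{p+1,p+1} & \cdots & a_{p+1,n} \\
0 & 0 & 0 & \cdots & 0 & a_{p+2,p+1} & \cdots & a_{p+2,n} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & a_{n,p+1} & \cdots & a_{nn}
\end{pmatrix}.$$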

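According to Theorem 7.2.1, the characteristic polynomial is the determinant of the matrix

$$\begin{pmatrix}
\lambda & 0 & 0 & \cdots & 0 & a_{1,p+1} & \cdots & a_{1n} \\
0 & \lambda & 0 & \cdots & 0 & a_{2,p+1} & \cdots & a_{2n} \\
0 & 0 & \lambda & \cdots & 0 & a_{3,p+1} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & \lambda & a_{p,p+1} & \cdots & a_{pn} \\
0 & 0 & 0 & \cdots & 0 & a_{p+1,p+1} & \cdots & a_{p+1,n} \\
0 & 0 & 0 & \cdots & 0 & a_{p+2,p+1} & \cdots & a_{p+2,n} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & a_{n,p+1} & \cdots & a_{nn}
\end{pmatrix} - tI_n$$

$$= \begin{pmatrix}
\lambda - t & 0 & 0 & \cdots & 0 & a_{1,p+1} & \cdots & a_{1n} \\
0 & \lambda - t & 0 & \cdots & 0 & a_{2,p+1} & \cdots & a_{2n} \\
0 & 0 & \lambda - t & \cdots & 0 & a_{3,p+1} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & \lambda - t & a_{p,p+1} & \cdots & a_{pn} \\
0 & 0 & 0 & \cdots & 0 & a_{p+1,p+1} - t & \cdots & a_{p+1,n} \\
0 & 0 & 0 & \cdots & 0 & a_{p+2,p+1} & \cdots & a_{p+2,n} \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & 0 & a_{n,p+1} & \cdots & a_{nn} - t
\end{pmatrix}.$$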

If we apply what we have learned about determinants from Chapter 6, we can see that the determinant of the matrix given above will simplify to a polynomial of degree $n$ with a factor of the form $(\lambda - t)^p$. Since the algebraic multiplicity is assumed to be $m$, it follows that $p \le m$, that is, the geometric multiplicity cannot exceed the algebraic multiplicity.
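The inequality in Theorem 7.3.4 can be strict. The sketch below, written in Python rather than ISETL, computes a geometric multiplicity as $n - \operatorname{rank}(A - \lambda I)$ using exact fractions. The function names `rank` and `geometric_multiplicity` and the $2 \times 2$ matrix `shear` are assumptions introduced for this illustration; the $3 \times 3$ matrix is the coefficient matrix from the differential equations example later in this section.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace of lam, that is, n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

shear = [[2, 1], [0, 2]]                    # characteristic polynomial (2 - t)^2
system = [[3, 1, 1], [2, 4, 2], [-1, -1, 1]]

print(geometric_multiplicity(shear, 2))     # algebraic multiplicity of 2 is 2
print(geometric_multiplicity(system, 2), geometric_multiplicity(system, 4))
```

For `shear`, the eigenvalue 2 has algebraic multiplicity 2, so comparing it with the computed geometric multiplicity shows whether the two agree.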

We will use this theorem to prove the following theorem, which shows that the equality of the algebraic and geometric multiplicities of each eigenvalue is a second consequence of diagonalizability.

Theorem 7.3.5. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation that is diagonalizable. Then, the geometric and algebraic multiplicities of each eigenvalue are equal.

Proof. Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_k$, $k \le n$, are the distinct eigenvalues of $T$. By Theorem 7.3.1, there exists a basis $B$ consisting exclusively of eigenvectors. Since $\dim(\mathbb{R}^n) = n$, there are $n$ vectors in the set $B$. For each $i = 1, 2, \ldots, k$, let $E_i$ be the set of those vectors in $B$ that correspond to the eigenvalue $\lambda_i$, let $j_i$ represent the number of vectors in $E_i$, and let $m_i$ denote the algebraic multiplicity of $\lambda_i$.

Each $E_i$, $i = 1, 2, \ldots, k$, is a basis for the eigenspace corresponding to $\lambda_i$. To begin with, since $E_i$ is a subset of $B$, it is a linearly independent set. In addition, any vector in the eigenspace of $\lambda_i$ can be written as a linear combination of $B$, from which it follows that any such vector can be written as a linear combination of $E_i$. (Can you fill in the details here?) Hence, $E_i$ forms a basis for the eigenspace of $\lambda_i$, which means that $j_i$ represents the geometric multiplicity of $\lambda_i$.

By Theorem 7.3.4, $j_i \le m_i$ for all $i = 1, 2, \ldots, k$. We can use this to say

$$n = \sum_{i=1}^{k} j_i \le \sum_{i=1}^{k} m_i \le n,$$

from which it follows that

$$\sum_{i=1}^{k} (m_i - j_i) = 0.$$

Since $m_i - j_i \ge 0$ for all $i = 1, 2, \ldots, k$, we can conclude that

$$j_i = m_i$$

for all $i = 1, 2, \ldots, k$, which is what we wished to prove.


As a result of Theorems 7.3.3 and 7.3.5, we know that the splitting of the characteristic polynomial and the equality of the algebraic and geometric multiplicities of each eigenvalue are consequences of diagonalizability. Can we go the “other way”? As the activities show, neither of these conditions in isolation is sufficient to ensure diagonalizability. What do we mean by sufficient here? Of the linear transformations in Activities 2, 3, and 4, only one proved to be diagonalizable. In this case, the characteristic polynomial split and the algebraic and geometric multiplicities were equal. As the next theorem shows, both of these things must occur together in order to ensure the existence of an eigenbasis.

Theorem 7.3.6. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be a linear transformation. If the characteristic polynomial of $T$ splits, and if the algebraic and geometric multiplicities of each eigenvalue of $T$ are equal, then $T$ is a diagonalizable transformation.

Proof. Let $\lambda_1, \lambda_2, \ldots, \lambda_k$ be the distinct eigenvalues of $T$. Let $m_i$, $i = 1, 2, \ldots, k$, represent the algebraic multiplicity of $\lambda_i$. Since the characteristic polynomial splits,

$$m_1 + m_2 + \cdots + m_k = n,$$

that is, the algebraic multiplicities add to the dimension of the vector space $\mathbb{R}^n$.

Let $j_i$, $i = 1, 2, \ldots, k$, denote the geometric multiplicity of $\lambda_i$. Let $E_i$, $i = 1, 2, \ldots, k$, be a basis for the eigenspace corresponding to $\lambda_i$. If we let

$$B = E_1 \cup E_2 \cup \cdots \cup E_k,$$

that is, $B$ is the collection of all eigenbasis vectors from each $E_i$, then $B$, according to Theorem 7.2.4, is a linearly independent set. By assumption, $j_i = m_i$ for all $i = 1, 2, \ldots, k$. Therefore, $B$ is a linearly independent set of $n$ eigenvectors. According to Theorem 4.4.8, $B$ forms an eigenbasis for $\mathbb{R}^n$. By Theorem 7.3.1, it follows that the matrix representation $[T]_B$ is diagonal.

Theorems 7.3.3, 7.3.4, and 7.3.6 can be combined into a single “if and only if” theorem. What is the statement of this theorem? Now that we have established dual conditions that are equivalent to diagonalizability, we can elaborate upon the procedure for finding an eigenbasis that was outlined briefly in the last section and alluded to in the exercises.


A Procedure for Diagonalizing a Transformation

In this subsection, we will provide a detailed description of the process of diagonalizing a linear transformation. We discuss each step in the context of working with a specific example. Let $T : \mathbb{R}^3 \to \mathbb{R}^3$ be a linear transformation defined by

$$T(\langle x_1, x_2, x_3 \rangle) = \langle 15x_1 + 7x_2 - 7x_3,\; -x_1 + x_2 + x_3,\; 13x_1 + 7x_2 - 5x_3 \rangle.$$

1. Find the matrix representation of $T$. In this case, we find the matrix representation with respect to the coordinate basis, which is

$$\begin{pmatrix} 15 & 7 & -7 \\ -1 & 1 & 1 \\ 13 & 7 & -5 \end{pmatrix}.$$

2. Find the eigenvalues of $T$. This involves completing the series of steps involving the characteristic polynomial, which are given below.

$$\left| \begin{pmatrix} 15 & 7 & -7 \\ -1 & 1 & 1 \\ 13 & 7 & -5 \end{pmatrix} - t \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right| = \begin{vmatrix} 15 - t & 7 & -7 \\ -1 & 1 - t & 1 \\ 13 & 7 & -5 - t \end{vmatrix} = -(t - 1)(t - 8)(t - 2).$$

Since the polynomial splits, one requirement of diagonalizability has been satisfied. What would have happened if the polynomial had not split? Would we be able to construct an eigenbasis? Why, or why not? For this example, the eigenvalues are 1, 2, and 8. Each has algebraic multiplicity 1.

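3. Find a basis for each eigenspace. If $a$ is an eigenvalue, $\mathbf{x}$ is an eigenvector with eigenvalue $a$ if and only if $\mathbf{x}$ is a nonzero solution of the matrix equation

$$([T]_C - aI_3) \cdot \mathbf{x} = \mathbf{0}.$$

$\lambda = 1$:

$$([T]_C - (1)I_3) \cdot \mathbf{x} = \mathbf{0}$$

$$\left( \begin{pmatrix} 15 & 7 & -7 \\ -1 & 1 & 1 \\ 13 & 7 & -5 \end{pmatrix} - (1) \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \right) \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$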


This yields the following system of equations,

$$\begin{aligned}
14x_1 + 7x_2 - 7x_3 &= 0 \\
-x_1 + 0x_2 + x_3 &= 0 \\
13x_1 + 7x_2 - 6x_3 &= 0,
\end{aligned}$$

whose solution set is $\{\langle r, -r, r \rangle : r \in \mathbb{R}\}$.

$\{\langle 1, -1, 1 \rangle\}$ is a basis for the eigenspace of $\lambda = 1$. The geometric multiplicity of this eigenvalue is therefore 1.

$\lambda = 2$:

$$([T]_C - (2)I_3) \cdot \mathbf{x} = \mathbf{0}$$

Solution Set $= \{\langle 0, r, r \rangle : r \in \mathbb{R}\}$.

$\{\langle 0, 1, 1 \rangle\}$ is a basis for the eigenspace of $\lambda = 2$. The geometric multiplicity of this eigenvalue is therefore 1. Can you fill in the details?

$\lambda = 8$:

$$([T]_C - (8)I_3) \cdot \mathbf{x} = \mathbf{0}$$

Solution Set $= \{\langle r, 0, r \rangle : r \in \mathbb{R}\}$.

$\{\langle 1, 0, 1 \rangle\}$ is a basis for the eigenspace of $\lambda = 8$. The geometric multiplicity of this eigenvalue is therefore 1. Can you fill in the details?

Since the characteristic polynomial splits, and since the geometric multiplicities add to the dimension of $\mathbb{R}^3$,

$$\{\langle 1, -1, 1 \rangle, \langle 0, 1, 1 \rangle, \langle 1, 0, 1 \rangle\}$$

forms an eigenbasis for $\mathbb{R}^3$. What would have happened if one or more of the eigenvalues had geometric multiplicities that failed to equal their corresponding algebraic multiplicities? Would the transformation be diagonalizable?


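4. Find the matrix representation with respect to the eigenbasis:

$$\begin{aligned}
T(\langle 1, -1, 1 \rangle) &= \langle 1, -1, 1 \rangle \\
T(\langle 0, 1, 1 \rangle) &= 2 \langle 0, 1, 1 \rangle \\
T(\langle 1, 0, 1 \rangle) &= 8 \langle 1, 0, 1 \rangle.
\end{aligned}$$

Therefore,

$$[T]_{\{\langle 1,-1,1 \rangle, \langle 0,1,1 \rangle, \langle 1,0,1 \rangle\}} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 8 \end{pmatrix}.$$

What is the change of basis matrix from the coordinate basis to the eigenbasis given here? How would we find this diagonal form, if we were only given the change of basis matrix?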
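The eigenvectors found in step 3 can be checked directly against the definition $T(\mathbf{v}) = \lambda\mathbf{v}$. The following sketch is written in Python rather than ISETL and is only an illustration; the helper names `apply` and `det3` are hypothetical. It confirms each eigenvalue and eigenvector pair and then checks that the three eigenvectors are linearly independent by computing the determinant of the matrix that has them as columns.

```python
def apply(A, v):
    """Multiply the matrix A (a list of rows) by the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

A = [[15, 7, -7], [-1, 1, 1], [13, 7, -5]]
eigenpairs = [(1, [1, -1, 1]), (2, [0, 1, 1]), (8, [1, 0, 1])]

for lam, v in eigenpairs:
    # A*v should equal lam*v for an eigenvector v with eigenvalue lam
    print(lam, apply(A, v) == [lam * x for x in v])

# Columns of C are the eigenvectors; a nonzero determinant means they form a basis.
C = [list(row) for row in zip(*[v for _, v in eigenpairs])]
print(det3(C))
```

Because each basis vector is sent to a scalar multiple of itself, the matrix representation with respect to this basis has the eigenvalues 1, 2, and 8 on its diagonal and zeros elsewhere.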

Now that we know how to find a diagonal form, the next issue is to see how it applies. This will be the focus of the remainder of this and the next section.

Using Diagonalization to Solve a System of Differential Equations

Consider the system of differential equations

$$\begin{aligned}
f_1' &= 3f_1 + f_2 + f_3 \\
f_2' &= 2f_1 + 4f_2 + 2f_3 \\
f_3' &= -f_1 - f_2 + f_3,
\end{aligned}$$

where each $f_i : \mathbb{R} \to \mathbb{R}$, $i = 1, 2, 3$, is to be a differentiable function. One solution of this system is that each $f_i$ is the zero function. However, we wish to find all solutions. Let $F : \mathbb{R} \to \mathbb{R}^3$ be given by

$$F(x) = \langle f_1(x), f_2(x), f_3(x) \rangle.$$

Since each $f_i$ is differentiable, $F$ is differentiable, and its derivative is given by

$$F'(x) = \langle f_1'(x), f_2'(x), f_3'(x) \rangle.$$

$F'$ and $F$ are related by the matrix equation

$$F'(x) = T \cdot F(x),$$


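where

$$T = \begin{pmatrix} 3 & 1 & 1 \\ 2 & 4 & 2 \\ -1 & -1 & 1 \end{pmatrix}.$$

As you can see, this is nothing more than the original system given above. We can show that the matrix given in the equation is diagonalizable. Its diagonal form with respect to the basis

$$B = \{\langle 1, 0, -1 \rangle, \langle 0, 1, -1 \rangle, \langle 1, 2, -1 \rangle\}$$

is

$$[T]_B = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{pmatrix}.$$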

We will use this diagonal form to help us find the solution set of this system of equations. The transition matrix from the coordinate basis to $B$ is given by the inverse of

$$M = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ -1 & -1 & -1 \end{pmatrix}.$$

According to Theorem 7.1.2,

$$T = M \cdot [T]_B \cdot M^{-1}.$$

It then follows that

$$F'(x) = T \cdot F(x) = M \cdot [T]_B \cdot M^{-1} \cdot F(x),$$

which is equivalent to

$$M^{-1} \cdot F'(x) = [T]_B \cdot M^{-1} \cdot F(x).$$

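If we let $G : \mathbb{R} \to \mathbb{R}^3$ be given by

$$G(x) = \begin{pmatrix} g_1(x) \\ g_2(x) \\ g_3(x) \end{pmatrix} = M^{-1} \cdot F(x),$$

then $G$ is differentiable and

$$G'(x) = \begin{pmatrix} g_1'(x) \\ g_2'(x) \\ g_3'(x) \end{pmatrix} = M^{-1} \cdot F'(x).$$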


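Since $F'(x) = M \cdot [T]_B \cdot M^{-1} \cdot F(x)$,

$$\begin{aligned}
G'(x) &= M^{-1} \cdot F'(x) \\
&= [T]_B \cdot M^{-1} \cdot F(x) \\
&= [T]_B \cdot G(x) \\
&= \begin{pmatrix} 2g_1(x) \\ 2g_2(x) \\ 4g_3(x) \end{pmatrix}.
\end{aligned}$$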

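The three equations

$$\begin{aligned}
g_1'(x) &= 2g_1(x) \\
g_2'(x) &= 2g_2(x) \\
g_3'(x) &= 4g_3(x)
\end{aligned}$$

are independent of each other and can be solved individually. Their solutions are given by

$$\begin{aligned}
g_1(x) &= c_1e^{2x} \\
g_2(x) &= c_2e^{2x} \\
g_3(x) &= c_3e^{4x}.
\end{aligned}$$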

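Since $G(x) = M^{-1} \cdot F(x)$, it follows that $F(x) = M \cdot G(x)$, which gives us

$$\begin{aligned}
F(x) = \begin{pmatrix} f_1(x) \\ f_2(x) \\ f_3(x) \end{pmatrix}
&= \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ -1 & -1 & -1 \end{pmatrix} \cdot \begin{pmatrix} c_1e^{2x} \\ c_2e^{2x} \\ c_3e^{4x} \end{pmatrix} \\
&= \begin{pmatrix} c_1e^{2x} + c_3e^{4x} \\ c_2e^{2x} + 2c_3e^{4x} \\ -c_1e^{2x} - c_2e^{2x} - c_3e^{4x} \end{pmatrix} \\
&= e^{2x}\left( c_1 \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} + c_2 \begin{pmatrix} 0 \\ 1 \\ -1 \end{pmatrix} \right) + e^{4x}\, c_3 \begin{pmatrix} 1 \\ 2 \\ -1 \end{pmatrix} \\
&= e^{2x}z_1 + e^{4x}z_2,
\end{aligned}$$

where $z_1$ and $z_2$ represent arbitrary elements of the eigenspaces corresponding to the eigenvalues 2 and 4, respectively.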
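One way to gain confidence in this general solution is to substitute it back into $F'(x) = T \cdot F(x)$ numerically. The sketch below is written in Python rather than ISETL. It estimates $F'(x)$ with a central difference and compares the result with $T \cdot F(x)$ at a sample point. The function names `F` and `residual`, the step size `h`, and the sample constants are all assumptions chosen for this illustration.

```python
import math

T = [[3, 1, 1], [2, 4, 2], [-1, -1, 1]]

def F(x, c1, c2, c3):
    """F(x) = e^{2x}(c1*<1,0,-1> + c2*<0,1,-1>) + e^{4x}*c3*<1,2,-1>."""
    e2, e4 = math.exp(2 * x), math.exp(4 * x)
    return [c1 * e2 + c3 * e4,
            c2 * e2 + 2 * c3 * e4,
            -c1 * e2 - c2 * e2 - c3 * e4]

def residual(x, c, h=1e-5):
    """Largest entry of |F'(x) - T*F(x)|, with F' estimated by a central difference."""
    plus, minus, here = F(x + h, *c), F(x - h, *c), F(x, *c)
    derivative = [(p - m) / (2 * h) for p, m in zip(plus, minus)]
    rhs = [sum(t * f for t, f in zip(row, here)) for row in T]
    return max(abs(d - r) for d, r in zip(derivative, rhs))

# A value close to zero indicates that F satisfies the system at x = 0.3.
print(residual(0.3, (1.0, -2.0, 0.5)))
```

The check is only numerical, so a small nonzero residual from the finite-difference approximation is expected; it does not replace the derivation above.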

Markov Chains

If $A$ is a square matrix, the notation $A^k$ refers to the $k$th power of $A$, which, as you would expect, is the product of $A$ with itself $k$ times:

$$A^k = \underbrace{A \cdot A \cdot A \cdots A}_{k \text{ times}}.$$

In Activity 5, we showed how to compute the power of a matrix using diagonalization. In particular, if $A$ is similar to a diagonal matrix $D$, we can compute any power $k$ of $A$ by computing the product

$$A^k = CD^kC^{-1},$$

where $C$ is the matrix whose columns are the components of each vector in the eigenbasis, and $D^k$ is found by raising each diagonal entry of $D$ to the $k$th power.

Theorem 7.3.7. Let $A$ be an $n \times n$ diagonalizable matrix with entries in $\mathbb{R}$. If $C$ is the transition matrix, and if $D$ is the diagonal form with respect to the eigenbasis, then, for any $k$,

$$A^k = CD^kC^{-1},$$

where $C$ is the matrix whose columns are the components of the vectors of the eigenbasis.
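As an illustration of Theorem 7.3.7, the sketch below (in Python rather than ISETL) compares $CD^kC^{-1}$ with the result of repeated multiplication for the $3 \times 3$ matrix diagonalized earlier in this section, whose eigenvalues are 1, 2, and 8 with eigenvectors $\langle 1,-1,1 \rangle$, $\langle 0,1,1 \rangle$, and $\langle 1,0,1 \rangle$. The inverse `C_inv` was computed by hand for this example and is verified inside the code; the function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[15, 7, -7], [-1, 1, 1], [13, 7, -5]]
C = [[1, 0, 1], [-1, 1, 0], [1, 1, 1]]         # columns: eigenvectors for 1, 2, 8
C_inv = [[-1, -1, 1], [-1, 0, 1], [2, 1, -1]]  # inverse of C, checked on the next lines
eigenvalues = [1, 2, 8]
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert matmul(C, C_inv) == identity

def power_by_diagonalization(k):
    """C * D^k * C^{-1}, where D^k raises each diagonal entry to the k-th power."""
    Dk = [[eigenvalues[i] ** k if i == j else 0 for j in range(3)]
          for i in range(3)]
    return matmul(C, matmul(Dk, C_inv))

def power_by_multiplication(k):
    """A multiplied by itself k times."""
    result = identity
    for _ in range(k):
        result = matmul(result, A)
    return result

print(power_by_diagonalization(5) == power_by_multiplication(5))
```

The diagonalization route needs only two matrix products regardless of $k$, which is the practical advantage when $k$ is large.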


We will apply this theorem in the following example involving Markov chains. Suppose we have two adjacent cities A and B in which both city managers wish to predict long term trends in the movement of population between the two cities. Currently, 70% of the people in the two cities live in City A, while 30% live in City B. In a typical year, 20% of the people in City A move to City B and 80% of the people remain in City A, while 10% of the people in City B move to City A, with 90% remaining in City B. If a 0.5% increase per year is expected for the two cities combined, what will be the population in each city in 30 years, if the current combined population is 150,000 people? We can first set up a table of migration data:

             From City A   From City B
To City A        .8            .1
To City B        .2            .9

The initial population distribution is given by

$$\begin{pmatrix} .7 \\ .3 \end{pmatrix}.$$

The proportion in City A after the first year will consist of 80% of the original 70% plus 10% of the 30% from City B, that is,

Proportion in City A after 1 year $= .8 \cdot .7 + .1 \cdot .3$.

Similarly, the proportion in City B after the first year will consist of 90% of the original 30% plus 20% of the 70% from City A, that is,

Proportion in City B after 1 year $= .2 \cdot .7 + .9 \cdot .3$.

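In terms of matrices, we have

$$\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix} \cdot \begin{pmatrix} .7 \\ .3 \end{pmatrix} = \begin{pmatrix} .8 \cdot .7 + .1 \cdot .3 \\ .2 \cdot .7 + .9 \cdot .3 \end{pmatrix} = \begin{pmatrix} .59 \\ .41 \end{pmatrix}.$$

The proportion in City A and City B after year two will be given by

$$\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}^{2} \cdot \begin{pmatrix} .7 \\ .3 \end{pmatrix}.$$

Can you explain why? After 30 years, the proportions will be given by

$$\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}^{30} \cdot \begin{pmatrix} .7 \\ .3 \end{pmatrix}.$$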

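Since the matrix

$$\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}$$

is diagonalizable, we can compute this product using Theorem 7.3.7. The eigenvalues are .7 and 1. $\{\langle -1, 1 \rangle\}$ is a basis for the eigenspace of .7, and $\{\langle 1, 2 \rangle\}$ is a basis for the eigenspace of 1. According to Theorem 7.3.7,

$$\begin{pmatrix} .8 & .1 \\ .2 & .9 \end{pmatrix}^{30} = \begin{pmatrix} -1 & 1 \\ 1 & 2 \end{pmatrix} \cdot \begin{pmatrix} .7 & 0 \\ 0 & 1 \end{pmatrix}^{30} \cdot \begin{pmatrix} -\frac{2}{3} & \frac{1}{3} \\ \frac{1}{3} & \frac{1}{3} \end{pmatrix}.$$

Using this equality, what is the proportion matrix after year 30? Using the growth assumption given at the beginning of the discussion of this problem, how many people will live in both cities combined after 30 years? How many will live in City A? How many will live in City B?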
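A result obtained from the diagonal form can be cross-checked by stepping the chain forward one year at a time. The Python sketch below (not ISETL) does exactly that. It assumes that the 0.5% growth compounds annually, which is one reading of the problem statement, and the names `step`, `history`, and `combined` are hypothetical.

```python
P = [[0.8, 0.1],
     [0.2, 0.9]]          # columns: from City A, from City B
initial = [0.7, 0.3]      # current proportions in City A and City B

def step(matrix, v):
    """One year of migration: multiply the migration matrix by the proportions."""
    return [sum(m * x for m, x in zip(row, v)) for row in matrix]

history = [initial]
for year in range(30):
    history.append(step(P, history[-1]))

# Combined population after 30 years, assuming 0.5% growth compounded annually.
combined = 150000 * 1.005 ** 30
in_A, in_B = (combined * share for share in history[30])

print(history[1])                              # proportions after one year
print(history[30], round(in_A), round(in_B))   # proportions and head counts after 30 years
```

Comparing `history[30]` with the product computed from $CD^{30}C^{-1}$ is a useful way to catch arithmetic slips in either approach.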

Exercises


1. Define $T : \mathbb{R}^2 \to \mathbb{R}^2$ by

$$T(\langle x_1, x_2 \rangle) = \langle 5x_1 - 3x_2,\; 3x_1 - x_2 \rangle.$$

Determine whether $T$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.

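2. Define $F : \mathbb{R}^2 \to \mathbb{R}^2$ by

$$F(\langle x_1, x_2 \rangle) = \begin{pmatrix} -2 & 4 \\ -1 & 4 \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$

Determine whether $F$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.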

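3. Define $H : \mathbb{R}^3 \to \mathbb{R}^3$ by

$$H(\langle x_1, x_2, x_3 \rangle) = \begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 2 \\ -2 & 0 & 3 \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$

Determine whether $H$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.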



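4. Define $T : \mathbb{R}^3 \to \mathbb{R}^3$ by

$$T(\langle x_1, x_2, x_3 \rangle) = \begin{pmatrix} 1 & 2 & 3 \\ 0 & -1 & 2 \\ 1 & 1 & 0 \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}.$$

Determine whether $T$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.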

5. Define $G : \mathbb{R}^4 \to \mathbb{R}^4$ by

$$G(\langle x_1, x_2, x_3, x_4 \rangle) = \langle 4x_1 + 2x_2 - 2x_3 + 2x_4,\; x_1 + 3x_2 + x_3 - x_4,\; 2x_3,\; x_1 + x_2 - 3x_3 + 5x_4 \rangle.$$

Determine whether $G$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the coordinate basis and the eigenbasis. If not, explain why.

6. Provide a proof of the first part of Theorem 7.3.1.

7. Justify each step of the equality given in the proof of Theorem 7.3.2.

8. Theorems 7.3.3, 7.3.4, and 7.3.6 can be combined into a single “if and only if” theorem. What is the statement of this theorem?

9. Let $P_3(\mathbb{R})$ be the vector space of polynomials of degree 3 or less. Define $T : P_3(\mathbb{R}) \to P_3(\mathbb{R})$ by

$$T(p) = p'' + p'$$

where $p \in P_3(\mathbb{R})$, $p''$ is the second derivative of $p$, and $p'$ is the first derivative of $p$. Determine whether $T$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the basis $\{1, x, x^2, x^3\}$ and the eigenbasis. If not, explain why.

10. Prove that if $A$ is a diagonal matrix, then its eigenvalues are the diagonal elements.

11. Prove that if $A$ is an upper triangular matrix, then its eigenvalues are the diagonal elements.

12. Prove that $\lambda = 0$ is an eigenvalue of a matrix $A$ if and only if $A$ is singular.

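13. Find the general solution of each system of differential equations.

(a)

$$\begin{aligned}
f_1' &= f_1 + f_2 \\
f_2' &= 3f_1 - f_2
\end{aligned}$$

(b)

$$\begin{aligned}
f_1' &= 8f_1 + 10f_2 \\
f_2' &= -5f_1 - 7f_2
\end{aligned}$$

(c)

$$\begin{aligned}
f_1' &= f_1 + f_3 \\
f_2' &= f_2 + f_3 \\
f_3' &= 2f_3
\end{aligned}$$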

14. Let $T : \mathbb{R}^n \to \mathbb{R}^n$ be an invertible linear transformation, that is, a transformation that is both one-to-one and onto. Show that $T$ is diagonalizable if and only if its inverse $T^{-1} : \mathbb{R}^n \to \mathbb{R}^n$ is diagonalizable.

15. Let $P_2(\mathbb{R})$ be the vector space of polynomials of degree 2 or less. Define $F : P_2(\mathbb{R}) \to P_2(\mathbb{R})$ by

$$F(p) = p(0) + p(1) \cdot (x + x^2)$$

where $p \in P_2(\mathbb{R})$, $p(0)$ is the value of the polynomial evaluated at $x = 0$, and $p(1)$ is the value of the polynomial evaluated at $x = 1$. Determine whether $F$ is diagonalizable. If it is, find an eigenbasis, and find the transition matrix between the basis $\{1, x, x^2\}$ and the eigenbasis. If not, explain why.

16. Let $A$ be a square matrix. A power of $A$, say $A^n$, is nothing more than a matrix product of $n$ copies of $A$. Use this definition to answer the following questions regarding various matrix polynomials in $A$ and the diagonalizability of $A$.

(a) Show that if a matrix $A$ is diagonalizable and all of its eigenvalues are either 1 or -1, then $A^2 = I$.

(b) Show that if a matrix $A$ is diagonalizable and all of its eigenvalues are either 1 or 0, then $A^2 = A$.

(c) Show that if a matrix $A$ is diagonalizable and all of its eigenvalues are either 3 or -5, then $A^2 + 2A - 15I = 0$.

(d) Can you think of a general statement for which the three previous exercises are special cases?

17. Prove that if $A$ is diagonalizable with distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then

$$|A| = \lambda_1\lambda_2 \cdots \lambda_n.$$

18. If $A$ and $B$ are similar matrices, prove that if $A$ is diagonalizable, then $B$ is diagonalizable.

19. Provide a proof for Theorem 7.3.7.

20. Answer the questions posed at the end of the discussion of the Markov chain example.

21. Construct a model of population flows between cities, suburbs, and nonmetropolitan areas of the U.S. Their respective populations in 1985 were 60 million, 125 million, and 55 million. The matrix giving probabilities of the moves is

               From City   From Suburb   From Nonmetro
To City           .96          .01           .015
To Suburb         .03          .98           .005
To Nonmetro       .01          .01           .98

Predict the population that will live in each category in 2010, if the total population is assumed to be 350 million.
