MATH1051
CALCULUS AND
LINEAR ALGEBRA I
Semester 1, 2019
Lecture Workbook
How to use this workbook
This book should be taken to lectures, tutorials and lab sessions. There are exercises, definitions and examples in the workbook for you to fill in. These will be covered in lectures, and you should write down all the information given.
The completed workbook will act as a study guide to assist you in working through assignments and preparing for the mid-semester and final exams. For this reason, it is very important to attend lectures.
The text for the calculus part of the course is Calculus (8th edition) by James Stewart. We often refer to the text in this workbook, and many of the definitions, theorems, and examples come from the text.
There is no set text for the linear algebra part of the course. Elementary Linear Algebra (11th edition) by Howard Anton covers the material well.
For further information about the course, please go to Blackboard at
http://blackboard.elearning.uq.edu.au
© Mathematics, School of Mathematics and Physics, The University of Queensland, Brisbane QLD 4072, Australia
Edited by Joseph Grotowski, Phil Isaac, Michael Jennings, Birgit Loch, Victor Scharaschkin, Mary Waterhouse, Poh Wah Hillock,
2019.
Contents
0 MATLAB
0.1 Introduction to MATLAB
0.2 Plotting Functions
0.3 Vectors & Matrices
0.4 Systems of Linear Equations
0.5 Eigenvalues & Eigenvectors
0.6 Plotting points
0.7 Introduction to M-files
0.8 Writing Functions
0.9 Loops
0.10 Useful Commands for MATH1051
0.11 Appendix
1 Numbers
1.1 Number systems
1.2 Real number line and ordering on R
1.3 Definition: Intervals
1.4 Absolute value
1.5 Complex Numbers
1.6 Polar form
1.7 Euler’s formula
2 Functions
2.1 Definition: Function, domain, range
2.2 Graphs
2.3 Convention (domain)
2.4 Vertical line test
2.5 Exponential functions
2.6 Composition of functions
2.7 One-to-one (1-1) functions
2.8 Inverse Functions
2.9 How to find f^{-1}
2.10 Logarithms
2.11 Natural logarithm
2.12 Inverse trigonometric functions
3 Sequences
3.1 Formal Definition: Sequence
3.2 Representations
3.3 Limits
3.4 Theorem: Limit laws
3.5 Useful sequences to remember
3.6 Theorem: Squeeze
3.7 The formal definition of a limit of a sequence
4 Limits
4.1 Definition: Limit
4.2 Properties
4.3 One-sided limits
4.4 Theorem: Squeeze principle
4.5 Limits as x approaches infinity
4.6 Some important limits
5 Continuity
5.1 Definition of Continuity
5.2 Continuity on Intervals
5.3 Properties of Continuous Functions
5.4 The Intermediate Value Theorem (IVT)
5.5 Application of the IVT (Bisection Method)
6 Derivatives
6.1 Tangents
6.2 Definition of Derivative
6.3 Differentiability implies Continuity
6.4 The Chain Rule
6.5 Derivative of an Inverse Function
6.6 L’Hopital’s Rule
6.7 Continuous Extension of Sequences
6.8 The Mean Value Theorem (MVT)
6.9 Increasing and Decreasing Functions
6.10 Increasing/Decreasing Test
6.11 Local Maxima and Minima
6.12 The First Derivative Test
6.13 Higher Derivatives
6.14 The Second Derivative Test
6.15 The Extreme Value Theorem
7 Series
7.1 Infinite sums (notation)
7.2 Motivation
7.3 The Harmonic Series
7.4 Definition of Convergence
7.5 The p-test
7.6 The Divergence Test (also called the nth term test)
7.7 Geometric series
7.8 Application: Bouncing ball
7.9 The Comparison Test
7.10 Alternating Series
7.11 Absolute and conditional convergence
7.12 The Ratio Test
8 Power series and Taylor series
8.1 Definition: Power series
8.2 Radius of convergence
8.3 Taylor Series
8.4 The Formula for Taylor Series
8.5 New series from old
8.6 Binomial Series
9 Integration
9.1 Antiderivatives
9.2 Indefinite Integrals
9.3 Area Under a Curve
9.4 The Fundamental Theorem of Calculus
9.5 Volume of Revolution
9.6 Improper Integrals
9.7 Techniques of Integration
9.8 Integrals Involving the ln Function
9.9 Partial Fractions
10 Vectors
10.1 Row and column vectors
10.2 Dot Product
10.3 The Projection Formula
10.4 Vectors in R^n
10.5 Properties of Dot Product
11 Matrices and Linear Transformations
11.1 2 × 2 matrices
11.2 The effect of matrix multiplication on vectors
11.3 Composing Transformations & Matrix Multiplication
11.4 Definition: Matrix
11.5 Equality
11.6 Addition
11.7 Scalar Multiplication
11.8 Matrix Multiplication
11.9 Transposition
12 Gaussian elimination
12.1 Simultaneous equations
12.2 Gaussian elimination
12.3 The general solution
12.4 Gauss-Jordan elimination
13 Inverses
13.1 The Identity Matrix
13.2 Definition: Inverse
13.3 Algorithm to find the inverse matrix
13.4 Justification of the algorithm: Left and Right inverses
13.5 Uniqueness of the inverse
13.6 Properties equivalent to invertibility
14 Determinants
14.1 Definition: Determinant
14.2 Properties of Determinants
14.3 Connection with inverses
15 Vector Products in 3-Space
15.1 Definition: Cross product
15.2 Application: area of a triangle
15.3 Scalar Triple Product
15.4 Geometrical Interpretation
16 Eigenvalues and eigenvectors
16.1 Ranking Webpages
16.2 Geometry of eigenvectors
16.3 Eigenvalues & Vectors
16.4 How to find eigenvalues
17 Vector Spaces
17.1 Linear combinations
17.2 Linear independence
17.3 How to test for linear independence
17.4 Invertible matrices and linear independence
17.5 Vector spaces
17.6 Eigenspace
17.7 The span of a set of vectors
17.8 Bases
17.9 Dimension
17.10 Components
17.11 Further properties of bases
18 Review
18.1 Review
18.2 Position Vectors
18.3 Definition: Norm
18.4 Vector Addition
18.5 Scalar multiplication
18.6 Unit Vectors
18.7 Vectors in 3 dimensions
18.8 Row and column vectors
18.9 Dot Product
18.10 Trigonometric functions (sin, cos, tan)
19 Appendix - Practice Problems
19.1 Numbers
19.2 Functions
19.3 Limits
19.4 Continuity
19.5 Derivatives
19.6 Integration
19.7 Sequences
19.8 Series
19.9 Power series and Taylor series
19.10 Vectors
19.11 Matrix Algebra
20 Appendix - Mathematical notation
21 Appendix - Basics
21.1 Powers
21.2 Multiplication/Addition of real numbers
21.3 Fractions
21.4 Solving quadratic equations
21.5 Surds
0 MATLAB
0.1 Introduction to MATLAB
0.1.1 Getting Started
• Log in using your library username and password. Ask your tutor if you have a problem.
• Open MATLAB just like any other Windows program. Start → Programs → MATLAB → MATLAB R2007a
• Save all files to your student directory on H:/. Files saved onto individual computers or S:/ will be deleted at the end of semester.
0.1.2 The MATLAB Program Window
The MATLAB program window consists of the command window, command history, current directory and a menu bar for opening and editing M-files.
The command window is the main window we will use. Simple calculations can be performed in the command window. Executing functions, defining variables, viewing answers and information about errors which may have occurred also all happen in the command window.
A history of the commands entered into the command window can be found in the command history. Selecting a command from this window will re-execute the command in the command window.
The current directory contains a list of files that MATLAB can utilize. This is also the folder in which MATLAB will save your work. The current directory should be set to your student account on H:/.
0.1.3 Getting Help
There are a number of resources available should you need assistance when using MATLAB. The most easily accessible is MATLAB’s own built-in Help system.
• Clicking the blue question mark button in the toolbar will open the Help window, as will typing doc in the command window. You can search for the topic in question or navigate using the panel on the left.
• If you know the name of the function or operation you need help with, typing help <function name> in the command window will display a short description of the function and how to use it. Typing doc <function name> will open the Help window and navigate to the page dedicated to that function or operation.
If you have trouble finding what you need with MATLAB’s Help system, there is a great deal of documentation and useful articles available online. The Mathworks website
http://www.mathworks.com/products/matlab/
has a lot of helpful descriptions and solutions which go beyond what is available in the MATLAB Help, as well as m-files written and uploaded by users. A Google search often turns up results as well.
Practical sessions will rely heavily on the following resources:
• http://www-mdp.eng.cam.ac.uk/web/CD/engapps/octave/octavetut.pdf
• http://www.cs.ubc.ca/ nando/cpsc530a/handouts/cam matlab.pdf
Of course, your tutors and lecturers will be able to help you out during class or consultation times, and the First Year Learning Centre is open from 2–4pm in 67–641 every day.
0.1.4 Variables & Basic Commands
Variables in MATLAB are names for mathematical objects (eg. numbers, vectors, matrices) which are defined by the user, stored in MATLAB’s memory (for the current session), and can be used in calculations. Variables can be defined (assigned values) by using = . For example, the commands
>> x = 3
x =
3
>> y = 1009.781
y =
1009.781
assign the value 3 to the variable x and the value 1009.781 to the variable y.
Variable names can be letters or words, and can contain numbers, so long as they begin with a letter (eg. a1 is a valid variable name, but 1a is not). Variable names are case sensitive (eg. MATLAB treats x and X as different variables).
Using the command clear followed by the name of a variable will remove that variable from MATLAB’s memory. Using clear all will clear all variables at once. Note that it is not necessary to clear a variable before redefining it. However, it is often a good idea to clear a variable after it is used to avoid mistakenly using an incorrect value in future calculations.
MATLAB can be used in much the same way as a normal calculator for evaluating mathematical expressions and functions. Operations can be performed on numbers or variables (as long as the variables have already been defined). Some operations, functions and constants are saved permanently in MATLAB. We demonstrate a number of these here; a more thorough list can be found in the Appendix.
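The naming and clearing rules above can be tried out with a short sketch like the following. The built-in functions isvarname and exist are not otherwise used in this workbook; they are introduced here only as a convenient way of checking the rules.

```matlab
% Check which names are valid variable names.
valid_a1 = isvarname('a1');    % true: the name begins with a letter
valid_1a = isvarname('1a');    % false: the name begins with a digit

% Variable names are case sensitive, so x and X are separate variables.
x = 3;
X = 7;
total = x + X;                 % uses both variables, giving 10

% clear removes one variable; exist(name,'var') reports whether it is defined.
clear x
x_defined = exist('x', 'var'); % 0, since x has been cleared
X_defined = exist('X', 'var'); % 1, since X is still in memory
```

Each line ends with a semicolon, so nothing is displayed; enter a variable name on its own to see its value.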
As with any computer language, commands in MATLAB must follow certain syntax rules. If you get an error message after trying to execute a command, check that you have not made a mistake when typing the command (eg. brackets, apostrophes, commas).
The MATLAB commands for basic mathematical operations are exactly what one would expect. Addition, subtraction, multiplication and division are all demonstrated below.
>> x = 10
x =
10
>> y = 5;
>> x+y
ans =
15
>> x-y
ans =
5
>> x*y
ans =
50
>> x/y
ans =
2
Note that after entering the command x = 10 MATLAB provided feedback, but after the command y = 5; no feedback was produced. Following a command with a semicolon will stop MATLAB from displaying any output, though the command itself will still be executed.
Some other common operations (given below) include exponentiation, square root, log and sin. Note that log is the natural logarithm (base e), not base 10.
>> x = 10
x =
10
>> y = 5
y =
5
>> x^y
ans =
100000
>> sqrt(x)
ans =
3.1623
>> log(x)
ans =
2.3026
>> sin(x)
ans =
-0.5440
>> exp(x)
ans =
2.2026e+004
The last answer above demonstrates MATLAB’s use of scientific notation: it is the number 2.2026 × 10^4. Do not confuse the ‘e’ with Euler’s number e = 2.718...; to obtain this number, use the command exp(1).
Tip 1: Entering a variable name as a command will return the value of that variable.
Tip 2: To remove a variable from MATLAB’s memory, use the command clear followed by the variable name. To clear all variables at once, use clear all.
Tip 3: For help with any MATLAB command simply enter help followed by the command name (eg. help acos).
Tip 4: By default MATLAB will display numbers to 4 decimal places. The command format long changes this to 15 decimal places. To revert back to 4 decimal places, use format short.
Tip 5: Following any command with a semicolon will stop MATLAB from displaying any output (though the operation will still be performed).
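As a small sketch of the display conventions described above (scientific notation, exp(1), and the format long and format short commands), you could try the commands below. The variable names are chosen only for this illustration.

```matlab
% Scientific notation: 2.2026e+004 means 2.2026 x 10^4.
big = 2.2026e+004;
same = (big == 22026);      % true: both are the number 22026

% Euler's number comes from the exponential function, not from the letter e.
e_value = exp(1);

% format long displays more decimal places; format short restores the default.
format long
disp(e_value)
format short
disp(e_value)
```

Changing the format only affects how numbers are displayed; the value stored in e_value is the same in both cases.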
MATLAB has a number of other constants built-in, such as π and i. For example, the commands
>> x = 3*pi
x =
9.4248
>> y = log(x)
y =
2.2433
>> clear all
assign the value 3π to the variable x, set y to be ln(x), and then clear MATLAB’s memory of all variable values.
Check the Appendix for a more complete list of commonly used functions and constants.
Division of a non-zero number by 0 will return the result Inf or -Inf (representative of +∞ and −∞ respectively). Any other mathematically undefined operation will give the result NaN, which stands for Not-a-Number. For example,
>> 0/0
ans =
NaN
In normal circumstances, Inf and NaN should be considered errors.
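If you want to check for these values in your own calculations, one possible approach is sketched below. It uses the built-in test functions isinf and isnan, which are not covered elsewhere in this section.

```matlab
p = 1/0;        % Inf
n = -1/0;       % -Inf
u = 0/0;        % NaN

p_is_inf = isinf(p);              % true
n_is_inf = isinf(n) && (n < 0);   % true: infinite and negative
u_is_nan = isnan(u);              % true

% NaN is not equal to anything, including itself, so test with isnan, not ==.
nan_equal = (u == u);             % false
```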
Tip 6: Pressing the up arrow in the command window will scroll up through previously executed commands.
Tip 7: The result of the most recently executed command is stored as the variable ans.
Tip 8: The command who gives a list of all currently defined variables.
Tip 9: Some operations that might normally be considered undefined (eg. log(−1) or arccos(2)) will return complex answers in MATLAB. The reasons for this are not covered in MATH1051 (see courses on complex analysis for details).
0.2 Plotting Functions
0.2.1 General ezplot Commands
MATLAB has a number of built-in commands to make plotting functions very easy. MATLAB plots functions in a separate Figure window which will automatically open. To plot f(x) = sin(x) over MATLAB’s default domain −2π < x < 2π, use the command
>> ezplot('sin(x)')
In general, to plot an arbitrary function f(x) type
>> ezplot('f(x)')
Note that the ' ' symbols cannot be left out here.
MATLAB will endeavour to automatically assign a sensible range to the dependent variable if it is not specified. The domain and range can, however, both be controlled by the user. The first command below assigns the domain a ≤ x ≤ b, and the second assigns the same domain as well as range c ≤ f(x) ≤ d.
>> ezplot('f(x)',[a,b])
>> ezplot('f(x)',[a,b,c,d])
0.2.2 Labelling & Editing Figures
After a figure has been generated in MATLAB, it can be labelled and edited through the command window (check MATLAB’s help file for instructions on how to do this) or through the menus in the figure window itself. Axis labels, titles and legends can be added and modified through the Insert menu. To edit the plots themselves, use View → Property Editor.
0.3 Vectors & Matrices
0.3.1 Basic Vector & Matrix Operations
One of MATLAB’s strengths is its support for vectors and matrices. Performing operations with such objects is very easy in MATLAB, which has a wide range of built-in functions for manipulating them. The following commands demonstrate how to define the matrices and vectors
\[ a = \begin{pmatrix} 1 & 3 & 4 \end{pmatrix}, \quad b = \begin{pmatrix} 4 \\ 0 \\ 7 \end{pmatrix}, \quad \text{and} \quad A = \begin{pmatrix} 2 & 4 \\ 0 & 7 \end{pmatrix} \]
>> a = [1,3,4]
a =
1 3 4
>> b = [4;0;7]
b =
4
0
7
>> A = [2,4;0,7]
A =
2 4
0 7
Note that commas separate entries in a row and semicolons separate rows.
The following MATLAB segment details the commands which can be used to take the transpose of a; print the (1, 2) entry of A; find the inverse of A; and compute the product of A with itself.
>> a'
ans =
1
3
4
>> A(1,2)
ans =
4
>> inv(A)
ans =
0.5000 -0.2857
0 0.1429
>> A*A
ans =
4 36
0 49
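The results above can be double-checked with a short sketch such as the one below. The functions size, eye, abs and max are standard MATLAB functions that have not been introduced in this section, and the variable names are chosen only for this illustration.

```matlab
a = [1,3,4];
A = [2,4;0,7];

% The transpose of a 1-by-3 row vector is a 3-by-1 column vector.
size_a  = size(a);      % [1 3]
size_at = size(a');     % [3 1]

% Multiplying A by its inverse should give the 2-by-2 identity matrix eye(2),
% apart from small rounding errors.
check = A*inv(A);
err = max(max(abs(check - eye(2))));

% The (1,2) entry of A*A is (row 1 of A) times (column 2 of A): 2*4 + 4*7 = 36.
entry = A(1,1)*A(1,2) + A(1,2)*A(2,2);
```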
Tip 10: To plot multiple graphs on the same axes, use the command hold on before plotting the first one, and then hold off when finished.
Tip 11: A set of grid lines can be toggled on and off using the command grid, or through the Property Editor.
Tip 12: The name given to the independent variable in ezplot does not matter. For example, ezplot('sin(x)') and ezplot('sin(t)') will produce the same output.
Some functions which are normally applied to single numbers can be applied to all the elements of a matrix at once. For example,
>> log(a)
ans =
0 1.0986 1.3863
Here, applying the log function to a vector gives the natural logarithm of each element of the vector. Some functions require a full stop be inserted into the expression to operate in this way; for example, you can square each element of a matrix as follows:
>> a.^2
ans =
1 9 16
In general, functions which have special definitions when applied to matrices (for example, multiplication or exponentiation) will require the full stop, while functions which can only be applied to single numbers (like logarithms) will not. Consult MATLAB’s documentation to determine whether a function can operate on matrices, and whether it requires the full stop operator. See the Appendix for a more thorough list of MATLAB operations which can be performed on matrices and vectors, as well as the commands for some special matrices.
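To see the difference the full stop makes, the following sketch compares the matrix and element-by-element versions of multiplication and squaring, using the matrix A from Section 0.3.1. The variable names are chosen only for this illustration.

```matlab
A = [2,4;0,7];

matrix_product  = A*A;    % matrix multiplication: [4 36; 0 49]
element_product = A.*A;   % each entry multiplied by itself: [4 16; 0 49]

matrix_square  = A^2;     % the same as A*A
element_square = A.^2;    % the same as A.*A

% Functions such as sqrt act on each entry without needing the full stop.
root_values = sqrt([1,4,9]);   % [1 2 3]
```

Only the (1,2) entries differ here, because A has a zero below the diagonal; for a general matrix most entries of A*A and A.*A will be different.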
0.4 Systems of Linear Equations
You have seen how to create and manipulate matrices in MATLAB. Systems of linear equations can be written as matrix equations, so we can use MATLAB to easily solve such problems.
Consider the system of linear equations
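\[
\begin{aligned}
2x + y - z &= 1 \\
x + y - 2z &= -1 \\
2y - z &= 3
\end{aligned}
\]
You know from lectures that finding the solution to this is equivalent to solving the matrix equation Ax = b, where
\[ A = \begin{pmatrix} 2 & 1 & -1 \\ 1 & 1 & -2 \\ 0 & 2 & -1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 \\ -1 \\ 3 \end{pmatrix} \quad \text{and} \quad x = \begin{pmatrix} x \\ y \\ z \end{pmatrix} \]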
Remember that the commands
>> A = [2 1 -1; 1 1 -2; 0 2 -1]
A =
2 1 -1
1 1 -2
0 2 -1
>> b = [1; -1; 3]
b =
1
-1
3
will enter the values of A and b into MATLAB.
Tip 13: When entering a matrix, commas separate entries within a row and semicolons start new rows. Commas can be replaced with spaces (eg. matrix A in Section 0.3.1 could be entered as A=[2 4;0 7]).
Tip 14: In MATLAB, the column and row indices of vectors and matrices start at 1, not 0. So the value in the upper left corner of a matrix A is A(1, 1), not A(0, 0).
0.4.1 Solution by Inversion
One method for solving a matrix equation Ax = b is by noting that x = A^{-1}Ax = A^{-1}b, as long as A^{-1} exists. So to solve the above system, use the command
>> x = inv(A)*b
x =
0.2000
2.4000
1.8000
after entering the values of A and b.
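A quick way to check an answer obtained by inversion is sketched below. It assumes the connection between determinants and inverses from Chapter 14 (the inverse exists only when the determinant is non-zero) and uses the built-in function det; the variable names are chosen only for this illustration.

```matlab
A = [2 1 -1; 1 1 -2; 0 2 -1];
b = [1; -1; 3];

% The inverse exists only when det(A) is non-zero; here det(A) = 5.
d = det(A);

x = inv(A)*b;

% Substituting x back into the system should reproduce b, up to rounding error.
residual = max(abs(A*x - b));
```

A residual that is zero or extremely small (compared with the entries of b) indicates that x does solve the system.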
0.4.2 Solution by Gaussian Elimination
Another method for solving systems of linear equations is Gaussian elimination. The function rref performs Gaussian elimination to find the reduced row echelon form of a matrix. After finding the reduced row echelon form of an augmented matrix (A|b), the solution to Ax = b can just be read off.
To augment two matrices in MATLAB (after first making sure that the dimensions match up correctly), simply put them in square brackets separated by a space or comma. For example, to augment A and b above and then solve Ax = b, use the commands
>> AugA = [A b]
AugA =
2 1 -1 1
1 1 -2 -1
0 2 -1 3
>> rref(AugA)
ans =
1 0 0 0.2000
0 1 0 2.4000
0 0 1 1.8000
Another way to solve a system using Gaussian elimination is to use MATLAB’s backslash operator. To solve Ax = b use the command
>> x = A \ b
x =
0.2000
2.4000
1.8000
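The three approaches described in this section can be compared directly, as in the sketch below. The colon notation R(:,4), which picks out the whole fourth column of R, has not been introduced in this section, and the variable names are chosen only for this illustration.

```matlab
A = [2 1 -1; 1 1 -2; 0 2 -1];
b = [1; -1; 3];

x_inv = inv(A)*b;          % Section 0.4.1: solution by inversion

R = rref([A b]);           % Section 0.4.2: reduced row echelon form of (A|b)
x_rref = R(:,4);           % the solution is the last (fourth) column

x_bs = A \ b;              % backslash operator

% All three answers should agree, up to rounding error.
diff_rref = max(abs(x_inv - x_rref));
diff_bs   = max(abs(x_inv - x_bs));
```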
Tip 15: The command AugA = [A b] concatenates (joins) matrices A and b horizontally, while AugA = [A; b] concatenates A and b vertically. In both cases, the lengths of the sides being joined together must be the same.
0.5 Eigenvalues & Eigenvectors
MATLAB has a built-in function called eig to calculate the eigenvalues of a matrix and give corresponding eigenvectors. For example, to calculate the eigenvalues and eigenvectors of the matrix
$$A = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix},$$
use the following commands.
>> A = [1 3; 4 2];
>> [V,D] = eig(A)
V =
-0.7071 -0.6000
0.7071 -0.8000
D =
-2 0
0 5
Here, the values on the diagonal of D are the eigenvalues of A, and the columns of V are the corresponding eigenvectors. So A has an eigenvalue $\lambda = -2$ with associated eigenvector $x = [-0.7071, 0.7071]^T$ and another eigenvalue $\lambda = 5$ with associated eigenvector $x = [-0.6, -0.8]^T$.
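To see that the output of eig really does consist of eigenvalue and eigenvector pairs, we can test the defining relation Av = λv for every column at once, which in matrix form reads AV = VD. This is an illustrative sketch; the names check and lengths are our own.

```matlab
A = [1 3; 4 2];
[V, D] = eig(A);

% Each column v of V should satisfy A*v = lambda*v, that is, A*V = V*D
check = norm(A*V - V*D)

% MATLAB scales each eigenvector (column of V) to have length 1
lengths = [norm(V(:,1)), norm(V(:,2))]
```

The value of check should be zero up to rounding error. Note that any nonzero multiple of an eigenvector is also an eigenvector, so the signs and scaling shown by MATLAB are only one possible choice.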
0.6 Plotting points
You have seen how to use the ezplot command to plot functions of one variable. There are many situations where you may instead wish to plot a set of discrete points on a cartesian plane. MATLAB’s plot command can be used to do this.
To plot a set of points $\{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, use the command plot(x,y), where $x = [x_1, x_2, \ldots, x_n]$ and $y = [y_1, y_2, \ldots, y_n]$. For example, to plot the points (0, 0), (1, 1), (2, 3), (3, 6), use the commands
>> x=[0,1,2,3];
>> y=[0,1,3,6];
>> plot(x,y)
Tip 16: Be careful when using the command [V,D] = eig(A). No matter what you call V and D, the one which comes first will be the matrix of eigenvectors, and the other will be the matrix of eigenvalues.
Tip 17: You can use hold on to simultaneously plot continuous functions with ezplot and discrete points with plot.
There is an optional third argument for plot which specifies the colour, type of point and type of line which will be used to plot the points. For example,
>> plot(x,y,'ro')
will plot the points as red circles with no line joining them, and
>> plot(x,y,'mx-')
will plot the points as magenta crosses joined by a solid line. Use the command help plot for a full explanation.
As with ezplot, the properties of a figure (eg. title, axis labels, colour) can be edited using the menus in the figure window.
0.7 Introduction to M-files
M-files are text files which contain sequences of operations which MATLAB will execute when the files are run. In comparison to most programming languages, m-files are very easy to write, and are usually built around only a few core commands.
0.7.1 Creating & Opening M-files
To create a new m-file, either
• Click on the New M-File button in the menu bar; or
• Select File → New → M-File.
You should save m-files you wish to keep in your H:/ directory.
To open a saved m-file:
• Use the drop-down menu at the top of the screen to navigate to the correct folder, and then double-click the file in the Current Directory panel;
• Click the Open file button in the menu bar; or
• Select File → Open.
0.7.2 Writing & Running M-files
All of the commands which we have thus far covered will work as part of an m-file. For example, create a new m-file, write the commands
x = sqrt(5);
y = exp(x)
and save the file as example.m. In the main MATLAB window, use the drop-down menu to navigate to the folder where the file is saved, and then enter the command
>> example
y =
9.3565
in the command window. This illustrates the basic idea of using m-files: you write a sequence of commands, save them all together into an m-file, and then execute them all in order by calling the name of the m-file as a command. Of course, the above example is very simple, and would take less time to just evaluate in the command window.
M-files are far more useful when they are intended to be used over and over again, because the commands only need to be typed out once. For example, say we have a collection of numbers (which we can store as a vector), and we wish to perform some statistical analysis by calculating the size of the set, the mean, median, mode, range and standard deviation, and we also wish to sort the numbers into ascending order. MATLAB has built-in commands for all these operations. If we only have a single data set then we may as well just type all the commands into the command window; however, if we have many sets then we would be typing the same commands over again for each set. We can save time by putting all the commands into an m-file. Put the following into an m-file and save it as statanalysis.m.
% For a vector x of real numbers, outputs the size of the set x,
% the mean, median, mode, range and standard deviation of the data,
% and sorts the data into numerical order.
setsize = length(x)
meanvalue = mean(x)
medianvalue = median(x)
modevalue = mode(x)
range = max(x) - min(x)
standarddev = std(x)
sortedx = sort(x)
Now we can perform all the operations at once by entering the vector x and then running statanalysis. For example,
>> x = [1,4,10,-2,19,0,6,3,4,7];
>> statanalysis
setsize =
10
meanvalue =
5.2000
medianvalue =
4
modevalue =
4
range =
21
standarddev =
5.9777
sortedx =
-2 0 1 3 4 4 6 7 10 19
If we now have another data set we wish to analyze, we only need to redefine x to be that set, and run statanalysis again.
0.7.3 If Statements
While m-files can be used to save time by collecting together operations which would otherwise take time to execute in the command window, there are a number of special commands which make m-files considerably more powerful. The first one we will look at is the if statement. It has the basic form
if condition
    commands
end
where condition is a true/false statement and commands is a set of operations for MATLAB to perform. These commands will be executed if condition is true; if it is false, MATLAB will do nothing. For example, a valid if statement might be
if x>=0
    y=sqrt(x)
end
Here, if x is greater than or equal to 0, then y will be the square root of x. If x is negative then MATLAB does nothing. If we wanted MATLAB to do something when x is negative, we could add a second if statement. However, an else command can be inserted into the if statement to accommodate this:
if condition
    1st commands
else
    2nd commands
end
For example, let’s write a program to calculate the absolute value of a number. Create a new m-file and insert the following code.
if x < 0
    y=-x
else
    y=x
end
Save the file as absval.m. Set the current directory to the folder containing the file, and enter the commands
>> x = -5;
>> absval
y =
5
If your conditions have more than two possibilities (eg. a piecewise function which evaluates differently over different intervals), then you can also make use of elseif statements.
if 1st condition
    1st commands
elseif 2nd condition
    2nd commands
else
    3rd commands
end
You can use as many elseif statements as you like. As soon as a condition which is satisfied is found, the corresponding commands will be executed, and no further conditions will be checked. For example, consider the following m-file (save it as checksign.m)
if x < 0
    'negative'
elseif x > 0
    'positive'
else
    'zero'
end
This will check whether x is negative, positive or zero, and will output a string (the phrase inside the quotes ' ') saying as much. For example,
>> x = -10;
>> checksign
ans =
negative
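The piecewise-function use of elseif mentioned above can be sketched as follows. The function being evaluated and the file name piecewise.m are hypothetical choices made for illustration; x is set on the first line only so that the example runs on its own.

```matlab
% piecewise.m (hypothetical file name)
% Evaluates y = x^2 for x < 0, y = x for 0 <= x <= 1, and y = 1 for x > 1.
x = 0.5;   % sample input; normally x would be defined before running the file
if x < 0
    y = x^2
elseif x <= 1
    % only reached when x >= 0, because the first condition already failed
    y = x
else
    y = 1
end
```

Because the conditions are checked in order, the second condition does not need to repeat x >= 0: it is only tested when x < 0 has already been found to be false.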
The condition in an if statement must be a boolean (true/false) expression. MATLAB has a number of these, the simplest of which are detailed in the table below.
==    equal to
~=    not equal to
<     less than
<=    less than or equal to
>     greater than
>=    greater than or equal to
Tip 18: In MATLAB a word or phrase inside inverted commas (' ') is known as a string. Strings can be used to display messages or label figures and axes, among other things. Variables can take string values, though of course they cannot then be used in mathematical operations.
0.7.4 Good Habits
It is often the case that MATLAB code makes perfect sense to the author at the time of writing, but is incomprehensible to anyone else or even the author themselves later on. Here are a few guidelines to ensure that your code can be understood by anyone reading it.
• Use intuitive and descriptive names for m-files, functions (see the next section) and variables.
• Indent the commands inside specific statements (see the if statements above) and loops (discussed in a later section).
• Add thorough comments to your code. Anything written after a % character on a line is a comment and will be ignored by MATLAB. Describe what an m-file does, the meaning of all its variables, and what is achieved by any statements or loops.
0.8 Writing Functions
The example described in the previous section (absval.m) is fine for calculating the absolute value of a number, but every time we want to change numbers we have to redefine x, which may be undesirable. We can avoid this by converting the program into a function, so that we can input any value we like without having to change any variables. Open absval.m and change the first line so that the file reads as follows, and save it.
function y = absval(x)
if x< 0
    y=-x;
else
    y=x;
end
The first line of a function is very important. It gives the name of the function (here, absval; this must match the name of the m-file itself), the name of any input variables (here, x) and the name of the output variable (here, y). This function can now be executed like any other MATLAB function (eg. sin or log). To calculate the absolute value of −8, use the command
>> absval(-8)
ans =
8
The value −8 will be assigned to the input variable x within the function, and the value of the output variable y will be returned by the function.
Let’s look at another example. Say we want to write a function which takes two n × n matrices A and B as input, and calculates $(A + 2B)^2$. The m-file would look as follows.
function y = matrixsumsquare(A,B)
y = (A+2*B)^2;
Tip 19: When saving an m-file function, make sure the m-file has exactly the same name as the function (ie. the name given in the first line of the m-file).
To evaluate the function on matrices $A = \begin{pmatrix} 1 & 3 \\ 4 & 0 \end{pmatrix}$ and $B = \begin{pmatrix} 2 & 5 \\ 1 & 1 \end{pmatrix}$, use the commands
>> A = [1 3; 4 0];
>> B = [2 5; 1 1];
>> matrixsumsquare(A,B)
ans =
103 91
42 82
Note that as with any variables, A and B must be defined before matrixsumsquare can operate on them.
The only variables MATLAB can access while evaluating a function are those given as input to the function, or those which are defined within the function. For example, consider the function
function output = example(y)
output = x + y;
If you now try to evaluate this function by running the commands
>> x = 5;
>> example(10)
you will get an error, because the variable x is neither given as input for the function example, nor defined within example.m.
The reverse is also true: variables which are solely defined within functions will not be accessible outside of the function, with the exception of the output variable (which will be output as ans).
Note that these caveats only apply to functions; m-files which are not functions can access any variables defined in the command window, and vice versa.
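One way to repair the failing example is to pass x to the function explicitly, so that everything the function needs arrives through its inputs. The following sketch assumes a new file called example2.m; that name is our own and does not appear elsewhere in the workbook.

```matlab
function output = example2(x, y)
% example2 (hypothetical name): both x and y are now inputs,
% so the function no longer depends on the command window.
output = x + y;
```

With this file saved, the command example2(5, 10) adds the two inputs and returns their sum, whether or not a variable called x exists in the command window.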
0.9 Loops
0.9.1 While Loops
In a previous section you were introduced to the if statement, where MATLAB executes a command only when a condition holds. A while loop has the same structure as an if statement:
while condition
    commands
end
The difference is that while condition holds, commands is executed, and then the loop is restarted and condition is checked again. The first time that condition does not hold MATLAB exits the loop and moves on to the next line after end.
Consider the following example, which uses a while loop to calculate the factorial of a natural number n, written n!. Recall that $n! = n(n-1)(n-2) \cdots 2 \cdot 1$. MATLAB already has a built-in factorial function (called factorial), but we will write our own. Note that we should not call the function factorial, since a function with that name is already defined. We instead use the name fact.
function output = fact(x)
% temp is a temporary working variable
temp=1;
while x>1
    temp=temp*x;
    x=x-1;
end
output=temp;
Here, temp acts as a partial product: it starts out at 1, we multiply it by x, and then call that value temp. We then decrease the value of x by 1, and then multiply temp by x again. This is repeated until x takes the value 1, and we no longer need to multiply. The function’s output is then the final value of temp.
To use our function, we simply call it with a positive integer as the argument:
>> fact(6)
ans =
720
Note that if condition holds and commands never alters any of the values used in condition, then the loop will keep running forever and MATLAB will stop responding. When you write a while loop, ensure that there will definitely be some point at which condition is false. If you do happen to execute code containing an infinite loop, use the command Ctrl+c to stop it.
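A common safeguard against infinite loops is to count iterations and stop once a cap is reached. The sketch below is an illustration only: the halving task, the names count and maxiter, and the cap of 100 are all our own choices, and && (logical AND) is a MATLAB operator not otherwise covered in this workbook.

```matlab
% Repeatedly halve x until it drops below 1,
% but never run the loop more than maxiter times.
x = 1000;
maxiter = 100;   % safety cap (an arbitrary choice)
count = 0;
while x >= 1 && count < maxiter
    x = x/2;
    count = count + 1;
end
count
```

Here the loop condition involves both x and count, and both are changed inside the loop, so the loop is guaranteed to stop after at most maxiter iterations even if the first part of the condition never becomes false.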
0.9.2 For Loops
So far you have seen if statements and while loops, which execute commands when certain conditions hold. Sometimes, however, we simply want the commands to be executed a fixed number of times without depending on any conditions. This situation is handled by a for loop, which has the following form:
for variable = vector of values
    commands
end
For each of the values in the vector, variable is assigned that value and the commands are executed. This is best illustrated with an example. Consider the for loop
for j = [1,2,3,4,5]
    t(j) = j^2;
end
Note that as this code is executed, we evaluate t(1), t(2), t(3), t(4), t(5), so the output t is a 1 × 5 vector. In the first iteration of the loop, j takes the value 1, and then t(1) (that is, the first entry of the vector t) takes the value $1^2$. In the next iteration, j takes the value 2, and t(2) takes the value $2^2$. This continues on until j=5, after which the loop is exited.
We can use a for loop to write a factorial function, instead of a while loop as seen in the previous section. Consider the following function:
function output = fact2(x)
% temp is a temporary working variable
temp=1;
for j=2:x
    temp=temp*j;
end
output=temp;
Tip 20: The command Ctrl+c will stop MATLAB’s current operation.
Tip 21: There is an easier way to create the row vector [1, 2, 3, 4, 5], namely the command 1:5. In general, the command x:n:y will create the vector [x, x + n, x + 2n, x + 3n, . . . , y]; that is, the numbers from x to y in increments of n. If the n part is left out (eg. with the command 1:5), then MATLAB assumes n = 1.
Here, we defined the vector of values which j takes as 2:x. This is the vector [2, 3, . . . , x]; so the loop will be evaluated for j = 2, j = 3, . . . , j = x. Again, temp is acting as a partial product. In the first iteration of the for loop, j takes the value 2, and temp is redefined as temp*2. In the next iteration, j takes the value 3, and temp is redefined as temp*3. This is repeated until temp has been multiplied by all the integers 2, 3, . . . , x, and it is this final value of temp which becomes the output.
Note that if either fact or fact2 had been evaluated with a negative number as input, they would both return the value 1. This is because in both cases no iterations of the loop would be executed: in the function fact, the condition x > 1 would be false from the beginning, so the while loop is never entered; and in fact2, the vector 2:x would be empty, so the for loop is never entered. Hence the value of temp is never altered from its initial value of 1.
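Since a factorial of a negative number is not defined, silently returning 1 may hide a mistake. One possible remedy, sketched below, is to combine the if statement with MATLAB's built-in error command. The function name fact3 and the wording of the message are our own; || (logical OR) and floor are standard MATLAB but are not covered elsewhere in this workbook.

```matlab
function output = fact3(x)
% fact3 (hypothetical name): like fact2, but rejects invalid input.
% The input is invalid if it is negative or not a whole number.
if x < 0 || x ~= floor(x)
    error('Input must be a non-negative integer.')
end
temp = 1;
for j = 2:x
    temp = temp*j;
end
output = temp;
```

Calling fact3(6) behaves exactly like fact2(6), while fact3(-2) or fact3(2.5) stops with an error message instead of returning a misleading value.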
0.10 Useful Commands for MATH1051
In this section we mention a few MATLAB commands which may be useful when completing assignments and/or other tasks in MATH1051.
0.10.1 The trapz Function
The trapz function uses the trapezoid method to evaluate the integral of a function. Recall that the trapezoid method involves dividing up the region under a function into vertical strips, finding the values of the function where the strips meet, and using those values to calculate the area of the region.
The trapz function takes as its argument a vector, which it interprets as the set of values of a function at the points where the vertical strips meet. By default, MATLAB assumes the strips are one unit wide. For example, to find the area under the function f(x) = ln(x) between x = 1 and x = 100, use the commands
>> x=1:100;
>> y=log(x);
>> trapz(y)
ans =
361.4368
To use trapz when the width of the strips is not 1, simply multiply the final answer by the width of the strips. For example, to compute the same integral as above but with a strip width of 0.1, use the commands
>> x=1:0.1:100;
>> y=log(x);
>> 0.1*trapz(y)
ans =
361.5162
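As an alternative to multiplying by the strip width by hand, trapz also accepts the x values as a first argument and takes the spacing from them. The sketch below compares the two approaches with the exact value obtained from the antiderivative x ln(x) − x; the variable names are our own.

```matlab
x = 1:0.1:100;
y = log(x);

area_scaled = 0.1*trapz(y);   % approach used above: scale by the strip width
area_xy     = trapz(x, y);    % two-argument form: spacing is read from x

% Exact value of the integral of ln(x) from 1 to 100
exact = 100*log(100) - 99;

[area_scaled, area_xy, exact]
```

The two trapz results should agree with each other, and both should be slightly smaller than the exact value, since the trapezoid method underestimates the area under a curve that bends downwards like ln(x).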
0.10.2 fminsearch and fminbnd
The MATLAB functions fminsearch and fminbnd are used to find the minimum value of a function in some region. fminsearch takes as input the function itself and a starting point x0. For example, to find where the minimum value of f(x) = sin(x) near the value x0 = 0 occurs, use the command
>> fminsearch('sin(x)',0)
ans =
-1.5708
The answer -1.5708 is of course approximately −π/2, which is where a minimum of sin(x) occurs. To find the value of the function itself at the minimum, you can evaluate the function with a separate command (eg. sin(-1.5708)), or use the command
>> [xvalue,fvalue]=fminsearch('sin(x)',0)
xvalue =
-1.5708
fvalue =
-1.0000
The function fminbnd performs a similar task to fminsearch, except instead of starting at an initial guess x0, we provide an interval [a,b] inside which MATLAB will search for a minimum value. For example, to find where the minimum value of $f(x) = x^3 + 10x^2$ in the interval [−4, 4] occurs, use the command
>> fminbnd('x^3+10*x^2',-4,4)
ans =
-1.2543e-005
The answer $-1.2543 \times 10^{-5}$ is very close to 0, and it can easily be verified analytically that the minimum value of this function on the interval [−4, 4] does occur at 0 (where the function takes the value 0). As with fminsearch, to find the value of the function at this minimum, we can use the command
>> [xvalue,fvalue]=fminbnd('x^3+10*x^2',-4,4)
xvalue =
-1.2543e-005
fvalue =
1.5732e-009
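Both functions also accept a function handle, written with the @ symbol, in place of a string. Function handles are not covered elsewhere in this workbook, so treat the following as an optional sketch; the output names x1, f1, x2 and f2 are our own.

```matlab
% @(x) sin(x) creates an unnamed function of x that returns sin(x)
[x1, f1] = fminsearch(@(x) sin(x), 0)

% The same idea for the bounded search on the interval [-4, 4]
[x2, f2] = fminbnd(@(x) x^3 + 10*x^2, -4, 4)
```

The results should match those obtained with the string versions above, up to the tolerance of the search. Keep in mind that both commands return approximations, and that fminsearch finds a minimum near the starting point rather than the smallest value over all x.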
0.11 Appendix
0.11.1 Common Numerical Operations & Constants
The following table lists the MATLAB commands for a number of commonly used functions which operate on numbers, as well as some constants. Use MATLAB’s help command for more information.
Operation    MATLAB Code    Operation    MATLAB Code     Constant    MATLAB Code
x + y        x+y            e^x          exp(x)          π           pi
x − y        x-y            |x|          abs(x)          √−1         i
xy           x*y            sin(x)       sin(x)          e           exp(1)
x/y          x/y            cos(x)       cos(x)          ∞           Inf
x^n          x^n            tan(x)       tan(x)          −∞          -Inf
√x           sqrt(x)        arcsin(x)    asin(x)
ln(x)        log(x)         arccos(x)    acos(x)
log10(x)     log10(x)       arctan(x)    atan(x)
x × 10^n     xen            n!           factorial(n)
0.11.2 Vector & Matrix Operations
The following table lists the MATLAB commands for a number of commonly used operations which can be performed on vectors and/or matrices.
Operation                                        MATLAB Code
Entering a row vector: a = (1 3 4)               a=[1,3,4]
Entering a column vector: b = (4 0 7)^T          b=[4;0;7]
Entering a matrix: A with rows (2 4) and (0 1)   A=[2,4;0,1]
Second component of vector b:                    b(2)
Entry (1, 2) of a matrix A:                      A(1,2)
Transpose: c = a^T                               c=a'
Adding vectors or matrices: b + c                b+c
Dot product: b · c                               dot(b,c)
Cross product: b × c                             cross(b,c)
Norm: ‖a‖                                        norm(a)
Multiplying matrices: AB                         A*B
Matrix determinant: det(A)                       det(A)
Inverse: A^{-1}                                  inv(A)
Vector dimension:                                length(a)
Matrix dimensions:                               size(A)
Minimum value in vector b:                       min(b)
Maximum value in vector b:                       max(b)
0.11.3 Special Matrices
MATLAB has a number of special matrices built-in, and these can be produced with certain commands, given in the table below.
Matrix MATLAB Code
n× n identity matrix eye(n)
m× n matrix of ones ones(m,n)
m× n matrix of zeros zeros(m,n)
1 Numbers
Mathematics uses structures at a fundamental level. These basic structures include numbers, sets (e.g. intervals), shapes, vectors, complex numbers, quaternions and many more strange and exotic things. Numbers are very important in mathematics as building blocks as they lead to equations and inequalities which we can use as a tool to solve many abstract and applied problems.
1.1 Number systems
R :
The following are commonly used subsets of R:
N :
Z :
Q :
Irrational numbers are real numbers which cannot be represented as a ratio of integers. For example, $\sqrt{2}$, $\sqrt{3}$, $\sqrt{5}$, $\pi = 3.14159\ldots$, $e = 2.71828\ldots$ are all irrational. Note: proving rationality or irrationality of a given number can be quite subtle!
1.2 Real number line and ordering on R
The real number system can be visualised by imagining each real number as a point on a line, with the positive direction being to the right, and an arbitrary origin being chosen to represent 0 (see Figure 1).
Figure 1: The real number line. (Marked points: the integers from −3 to 3, together with −1.5, e and π.)
The real numbers are ordered, i.e. given any 2 real numbers a and b there holds precisely one of the following: a > b, a < b or a = b. This means we can use the symbols ‘<’, ‘>’, ‘≤’ and ‘≥’ to write statements such as $1 \le 2$ and $3 > \sqrt{3}$. Geometrically, a < b means that a lies to the left of b on the real number line. Note also that a ≤ b means that either a < b or a = b.
1.3 Definition: Intervals
An interval is a set of real numbers that can be thought of as a segment of the real number line. For a < b, the open interval from a to b is given by
$$(a, b) = \{x \in \mathbb{R} \mid a < x < b\}.$$
This notation is interpreted as
Note here that the end points are not included. If we wanted to include the endpoints, we would have a closed interval from a to b, denoted
In this notation it is important to distinguish between the round brackets (a, b) for an open interval (excluding the end points) and the square brackets [a, b] for a closed interval (including the end points).
We also have half-open intervals (or half-closed depending on how you feel), denoted
There are also infinite intervals such as
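$$(-\infty, a] = \{x \in \mathbb{R} \mid x \le a\},$$
$$(-\infty, a) = \{x \in \mathbb{R} \mid x < a\},$$
$$[a, \infty) = \{x \in \mathbb{R} \mid a \le x\},$$
$$(a, \infty) = \{x \in \mathbb{R} \mid a < x\},$$
$$\mathbb{R} = (-\infty, \infty).$$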
Note that ±∞ can never be included in an interval.
1.4 Absolute value
We define
$$|x| = \begin{cases} x, & \text{if } x \ge 0 \\ -x, & \text{if } x < 0 \end{cases}$$
1.4.1 Examples
|3| =
| − 4| =
|1− π| =
1.4.2 Properties of absolute value
For example in (iv), set a = 1 and b = −2. We have l.h.s. = |1 + (−2)| = |−1| = 1, r.h.s. = |1| + |−2| = 1 + 2 = 3.
1.4.3 Convention for √
For $a > 0$, $\sqrt{a}$ always denotes the positive solution of $x^2 = a$. Thus $\sqrt{4} = 2$ and so on. This means that we can now solve $x^2 = a$. For $a > 0$, the solutions to $x^2 = a$ are $x = \pm\sqrt{a}$.
1.5 Complex Numbers
Complex numbers were introduced in the 16th century to obtain roots of polynomial equations. A complex number is of the form
$$z = x + iy$$
where $x, y \in \mathbb{R}$ and i is (formally) a symbol satisfying $i^2 = -1$. The quantity x is called the real part of z and y is called the imaginary part of z.
The set of all complex numbers is denoted C. Eg 3− 2i ∈ C.
1.5.1 Example
The real part of 3− 2i is 3 and the imaginary part is −2 (not −2i).
Complex numbers can be added and multiplied by replacing $i^2$ everywhere with −1. For example $(2i)^2 = 4i^2 = -4$.
1.5.2 Example
Simplify (3− 2i)(1 + i).
$$\begin{aligned} (3 - 2i)(1 + i) &= 3 + 3i - 2i - 2i^2 \\ &= 3 + i + 2 = 5 + i \end{aligned}$$
1.5.3 Example
Suppose a, b ∈ R. Simplify (a+ bi)(a− bi).
$$\begin{aligned} (a + bi)(a - bi) &= a^2 - abi + abi - b^2 i^2 \\ &= a^2 + b^2 \end{aligned}$$
If $z = a + bi$ is a complex number, the number $a - bi$ is called the complex conjugate of z, denoted $\bar{z}$. Eg the complex conjugate of $3 + 2i$ is $\overline{3 + 2i} = 3 - 2i$.
The previous example shows that $z\bar{z}$ is always a real number.
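MATLAB handles complex numbers directly, which gives a convenient way to check hand calculations such as those above. The following sketch uses the built-in conj command; the variable names z, w, product and zbar are our own.

```matlab
z = 3 - 2i;
w = 1 + 1i;

product = z*w       % compare with the worked product (3 - 2i)(1 + i)
zbar = conj(z)      % complex conjugate of z
z*zbar              % a real number, equal to 3^2 + 2^2
```

Writing 1i rather than i for the imaginary unit avoids any clash with a variable that happens to be called i.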
1.5.4 Example
Simplify $\dfrac{3 - 2i}{1 - i}$.
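We multiply top and bottom by the complex conjugate of the denominator. This does not change the value of the fraction, but the new denominator is a real number.
$$\begin{aligned} \frac{3 - 2i}{1 - i} &= \frac{3 - 2i}{1 - i} \times \frac{(1 + i)}{(1 + i)} \\ &= \frac{(3 - 2i)(1 + i)}{1^2 + 1^2} \\ &= \frac{1}{2}(3 - 2i)(1 + i) \\ &= \frac{1}{2}(5 + i) \\ &= \frac{5}{2} + \frac{1}{2}i \end{aligned}$$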
It is a fact that if we consider complex roots of polynomials (and count them with their correct multiplicity), then a polynomial of degree n always has n roots. For example, every quadratic has two roots.
1.5.5 Example
Find the roots of $x^2 + 2x + 2 = 0$.
Use the quadratic formula:
$$x = \frac{1}{2}\left(-2 \pm \sqrt{4 - 8}\right) = \frac{1}{2}(-2 \pm 2i) = -1 \pm i.$$
Alternatively, complete the square. Take half the coefficient of the linear term 2x, namely 1. So consider $(x + 1)^2$:
$$\begin{aligned} x^2 + 2x + 2 = (x + 1)^2 + 1 &= 0 \\ \iff (x + 1)^2 &= -1 = i^2 \\ \iff x + 1 &= \pm i \end{aligned}$$
∴ $x = -1 \pm i$ are the roots.
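The roots found by hand can be checked numerically with MATLAB's roots command, which takes the vector of polynomial coefficients (highest power first), and polyval, which evaluates a polynomial at given points. This is a sketch for checking purposes; the names p and r are our own.

```matlab
p = [1 2 2];        % coefficients of x^2 + 2x + 2
r = roots(p)        % the two complex roots

% Substituting the roots back in should give (numerically) zero
polyval(p, r)
```

The two entries of r should be the complex conjugate pair found above, although MATLAB may list them in either order.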
1.6 Polar form
Real numbers are often represented on the real line. A complex number $z = x + iy$ may be represented by a point in the complex plane, where the horizontal axis is the real axis and the vertical axis is the imaginary axis.
We can also specify z by giving the length r and the angle θ in figure 2. The quantity r is called the modulus of z, denoted |z|. It measures the distance of z from the origin. The angle θ is called the argument of z. We have:
Figure 2: A complex number can be represented in rectangular or polar form. (Axes: Real, Imaginary; labels: x, y, r, θ and the point z = x + iy.)
$$x = r\cos\theta, \qquad y = r\sin\theta \quad \Rightarrow \quad z = x + iy = r(\cos\theta + i\sin\theta).$$
Also $r = |z| = \sqrt{x^2 + y^2}$ and $\tan\theta = \dfrac{y}{x}$ if $x \neq 0$.
1.6.1 Example
Write z = 1 + i in polar form.
First find the modulus: $|z| = \sqrt{1 + 1} = \sqrt{2}$.
From the figure, the argument is $\dfrac{\pi}{4}$.
1.7 Euler’s formula
Euler’s formula states that for any real number θ:
$$\cos\theta + i\sin\theta = e^{i\theta}.$$
(To make sense of this, one has to define the exponential function for complex arguments. This may be done using a series.)
Thus every complex number $z = x + iy$ can be represented in polar form
$$z = re^{i\theta}.$$
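The modulus, argument and polar form can all be computed in MATLAB with the built-in commands abs, angle and exp. The sketch below revisits z = 1 + i and also tests Euler's formula at one sample angle; the angle t = 0.7 and the variable names are arbitrary choices of ours.

```matlab
z = 1 + 1i;
r = abs(z)                  % modulus
theta = angle(z)            % argument, in radians
z_polar = r*exp(1i*theta)   % polar form; should reproduce z

% Euler's formula at a sample angle: the gap should be (numerically) zero
t = 0.7;
euler_gap = abs(exp(1i*t) - (cos(t) + 1i*sin(t)))
```

Note that angle always returns a value between −π and π, so it picks one particular argument out of the infinitely many that differ by multiples of 2π.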
2 Functions
2.1 Definition: Function, domain, range
Let X and Y be subsets of $\mathbb{R}$. A function $f : X \to Y$ is a rule which assigns to every element $x \in X$ exactly one element $f(x) \in Y$ called the value of f at x. Here X is called the domain of f and
$$f(X) = \{f(x) \mid x \in X\}$$
is called the range of f, also written range(f).
The range of f, f(X), is a subset of Y. The range is the set of all possible values of f(x) as x varies throughout the domain. Note that f(X) is not necessarily equal to all of Y.
2.1.1 Example
The function $f : \mathbb{R} \to \mathbb{R}$ such that $f(x) = x^2$ is a function with domain $\mathbb{R}$. The range is given by
f(R) = {f(x) | x ∈ R}
=
=
2.1.2 Example
The following is an example of a piecewise defined function, defined by different rules on different parts of its domain: $g : (-6, 7) \to \mathbb{R}$,
$$g(x) = \begin{cases} -5, & -6 < x < 0 \\ -\pi, & x = 0 \\ x, & 0 < x < 7 \end{cases}$$
Piecewise defined functions are discussed in Stewart, p. 15.
Clearly the domain of g is the open interval (−6, 7), but what about the range? Looking at the algebraic expression for g given above, the range can be expressed as
2.2 Graphs
We can represent a function by drawing its graph which is the set of all points (x, y) in a plane where y = f(x).
2.3 Convention (domain)
An expression like “the function $y = \sqrt{1 - x^2}$” means the function f with $y = f(x) = \sqrt{1 - x^2}$. When the domain is not specified it is taken to be the largest subset of $\mathbb{R}$ on which the rule is defined (and gives a real output). In this example, the domain would be [−1, 1].
2.4 Vertical line test
Not every curve represents the graph of a function. The crucial function property states that for each value x in the domain there must correspond exactly one value y in the range. Thus in the graph of a function, any vertical line x = constant must cut the graph in at most one point.
The graph for the circle $x^2 + y^2 = 1$ is given in Figure 5. We can clearly see that the vertical line intersects the circle at two points. In this case the two y values are given by $y = \pm\sqrt{1 - x^2}$. Therefore $x^2 + y^2 = 1$ does not give rise to a function on any domain intersecting (−1, 1).
Figure 5: The circle $x^2 + y^2 = 1$ with a vertical line intersecting at two points. This shows that the equation for a circle does not give rise to a function.
2.4.1 Example
Both $y = \sqrt{1 - x^2}$ (the top half of the circle) and $y = -\sqrt{1 - x^2}$ (the bottom half of the circle) are functions of x. What are the domains and ranges of these two functions?
2.4.2 Example
Look at the graphs in Figures 6(a) and 6(b). Do either of these graphs represent functions?
Figure 6: Do either of these two graphs represent functions? (Two panels, (a) and (b), each with axes x and y.)
2.5 Exponential functions
An exponential function is one of the form $f(x) = a^x$, where the base a is a positive constant, and x is said to be the exponent or power. One very common exponential function which we shall see often in this course is given by $f(x) = e^x$. It has the graph shown in Figure 7. Notice that it cuts the y-axis (the line x = 0) at y = 1.
Figure 7: The function $f(x) = e^x$. (Horizontal axis: x; vertical axis: exp(x).)
Exponential functions are very useful for modelling many natural phenomena such as population growth (base a > 1) and radioactive decay (base 0 < a < 1).
2.5.1 Example
The half-life of the isotope strontium-90, $^{90}$Sr, is 29 years. This means that half of any quantity of $^{90}$Sr will disintegrate in 29 years. Say the initial mass of a sample is 24 mg; write an expression for the mass remaining after t years.
Let $m(t)$ be the mass remaining at time t, and $m(0) = m_0$. Find $m(t)$ as a function of t.
Let h be the half-life, so that $m(h) = \dfrac{m_0}{2}$.
So $m(2h) = \dfrac{m_0}{4}$, $m(3h) = \dfrac{m_0}{8}$, . . . , $m(nh) = \dfrac{m_0}{2^n}$.
Hence $m(xh) = \dfrac{m_0}{2^x} = m_0 \cdot 2^{-x}$, for $x \in \mathbb{R}$, $x \ge 0$.
So $m(t) = m\left(\dfrac{t}{h}\,h\right) = \dfrac{m_0}{2^{t/h}} = m_0 \cdot 2^{-t/h}$.
For the current problem, $m_0 = 24$ mg and $h = 29$ years, so $m(t) = 24 \cdot 2^{-t/29}$ mg.
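The decay formula is easy to evaluate in MATLAB. The sketch below defines m(t) as an anonymous function (the @(t) syntax, which is not covered in the MATLAB chapter) and evaluates it at a few times; the sample times and the variable names are our own choices.

```matlab
m0 = 24;    % initial mass in mg
h  = 29;    % half-life in years

% Mass remaining after t years; .^ makes the formula work on vectors of times
m = @(t) m0 * 2.^(-t/h);

times  = [0 29 58 100];
masses = m(times)
```

The first three entries of masses should show the mass halving every 29 years, which is a quick check that the formula has been entered correctly.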
2.6 Composition of functions
Let f and g be two functions. Then the composition of f and g, denoted f ◦ g, is the function defined by
2.6.1 Example
$f(x) = x^2 + 1$, $g(x) = \dfrac{1}{x}$. Their compositions $f \circ g$ and $g \circ f$ are given by
Notice how these two composite functions are not equal. In general, $f \circ g \neq g \circ f$.
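A numerical spot check can confirm that the order of composition matters without working out the formulas. The sketch below evaluates both compositions at the single point x = 2, using anonymous functions (the @(x) syntax); the test point and the names fg and gf are our own choices.

```matlab
f = @(x) x.^2 + 1;
g = @(x) 1./x;

fg = f(g(2))    % (f o g)(2): apply g first, then f
gf = g(f(2))    % (g o f)(2): apply f first, then g
```

If the two printed values differ, the two compositions cannot be the same function.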
2.7 One-to-one (1-1) functions
(Stewart, p. 400) A function $f : X \to Y$ is said to be one-to-one (usually written as 1-1) or injective if, for all $x_1, x_2 \in X$,
On the graph of f, the 1-1 property holds exactly if any horizontal line y = constant cuts through the curve in at most one place; see Figure 8.
Figure 8: Are either of these functions 1-1? (Two panels, (a) and (b), each with axes x and y.)
2.7.1 Example
Show that the function f defined by $f(x) = \sqrt[3]{2x - 5}$ is 1-1.
The domain of f is $\mathbb{R}$. We need to show that
$$f(x_1) = f(x_2) \implies x_1 = x_2 \quad \text{for any } x_1, x_2 \in \mathbb{R}.$$
But
$$\begin{aligned} f(x_1) = f(x_2) &\implies \sqrt[3]{2x_1 - 5} = \sqrt[3]{2x_2 - 5} \\ &\implies 2x_1 - 5 = 2x_2 - 5 \\ &\implies 2x_1 = 2x_2 \\ &\implies x_1 = x_2 \end{aligned}$$
Therefore f is 1-1.
2.8 Inverse Functions
This material is covered in Stewart, pp. 401-406.
Let $f : X \to Y$ be a 1-1 function. For each $y \in f(X)$, the range of $f$, there is a unique $x$ with $f(x) = y$.
Define the inverse function $f^{-1} : f(X) \to X$ by $f^{-1}(y)$ = that unique $x \in X$ with $f(x) = y$.
So $f^{-1}(y) = x \iff y = f(x)$.
The inverse function reverses the direction of the mapping: $f : x \mapsto y$ but $f^{-1} : y \mapsto x$.
$f^{-1}(y) = x$ and $y = f(x)$, so
$f^{-1}(f(x)) = x$ for all $x \in X$ (2.1)
and $f(f^{-1}(y)) = y$ for all $y \in f(X)$ (2.2)
dom $f^{-1}$ = range $f$
range $f^{-1}$ = dom $f$
$f$ must be 1-1 in order that $f^{-1}$ be a function.
2.9 How to find $f^{-1}$
$f(x) = y \implies f^{-1}(y) = x$
∴ To find $f^{-1}$, solve for $x$ in terms of $y$.
2.9.1 Find $f^{-1}$ if $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \sqrt[3]{2x - 5}$
We saw previously that $f$ is 1-1. So $f^{-1}$ exists.
dom$(f) = \mathbb{R}$ and $f(\mathbb{R}) = \mathbb{R}$. $f^{-1} : f(\mathbb{R}) \to$ dom$(f)$, so $f^{-1} : \mathbb{R} \to \mathbb{R}$.
$y = f(x) = \sqrt[3]{2x - 5}$. Solve for $x$ in terms of $y$:
$y^3 = 2x - 5$ ∴ $2x = y^3 + 5$ ∴ $x = \frac{1}{2}(y^3 + 5)$.
So $f^{-1}(y) = \frac{1}{2}(y^3 + 5)$.
The name of the variable is irrelevant ∴ we can write
$f^{-1}(x) = \frac{1}{2}(x^3 + 5)$.
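The result of Example 2.9.1 can be sanity-checked by composing $f$ with the formula found for $f^{-1}$. This is a minimal Python sketch; the names `f` and `f_inv` are chosen for this example, and the real cube root is computed with a sign-preserving helper because `** (1 / 3)` on a negative float does not return the real root in Python.

```python
import math


def f(x):
    """f(x) = cube root of (2x - 5), defined for every real x."""
    u = 2 * x - 5
    return math.copysign(abs(u) ** (1 / 3), u)


def f_inv(y):
    """Candidate inverse from the worked example: (y**3 + 5) / 2."""
    return (y ** 3 + 5) / 2


# Both compositions should return the input, up to rounding error.
for x in (-4.0, 0.0, 2.5, 10.0):
    print(x, f_inv(f(x)), f(f_inv(x)))
```

Agreement at a handful of points does not prove the formula, but it is a cheap way to catch an algebra slip when solving for $x$ in terms of $y$.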
In order to obtain the graph of $f^{-1}(x)$, we reflect the graph of $f(x)$ about the line $y = x$; for example, see Figure 9 below. The dashed line represents the graph of the inverse function, having been reflected about the line $y = x$.
Figure 9: Graph of $f^{-1}(x)$, showing the curves $y = f(x)$, $y = x$ and $y = f^{-1}(x)$.
2.9.2 Example
Draw the graph of the inverse function of $f(x) = x^3$ (the cube root function $f^{-1}(x) = x^{1/3}$) on the same axes given below in Figure 10.
2.9.3 Example
$f : \mathbb{R} \to \mathbb{R}$, $f(x) = x^2$ is not 1-1 and therefore has no inverse. However, restricting to $x \ge 0$ gives a 1-1 function $f : [0, \infty) \to \mathbb{R}$, $f(x) = x^2$, with range $[0, \infty)$. The inverse of this function is then $f^{-1} : [0, \infty) \to [0, \infty)$, $f^{-1}(x) = \sqrt{x}$. Similarly the negative half of the function $f(x) = x^2$ is 1-1, with inverse $f^{-1} : [0, \infty) \to (-\infty, 0]$, $f^{-1}(x) = -\sqrt{x}$.
Figure 10: $f(x) = x^3$. Draw in the graph of $f^{-1}(x) = x^{1/3}$.
This technique is often used when the function is not 1-1 over its entire domain: just take a part where it is 1-1 and determine the inverse for that part.
For further discussion on how to find the inverse function of a 1-1 function, see Stewart, p. 400.
2.10 Logarithms
(Stewart, p. 421) Logarithms are the inverse functions of the exponential functions.
Figure 11: Three plots of $y = a^x$ with $a = 2, 3, 4$.
From the graph of $y = a^x$ ($a \ne 1$ a positive constant), we see that it is 1-1 and thus has an inverse, denoted $\log_a x$. From this definition we have the following facts:
$\log_a(a^x) = x \quad \forall x \in \mathbb{R}$,
$a^{\log_a x} = x \quad \forall x > 0$.
What are the domain and range of $f(x) = \log_a(x)$?
2.11 Natural logarithm
Now we set $a = e$ (Euler’s number = 2.71828...). The inverse function of $f(x) = e^x$ is
$\log_e(x) \equiv \ln x$.
Figure 12: Graphs of $f(x) = e^x$ and $g(x) = \ln x$.
2.11.1 Properties
Using exponent laws, together with the fact that $\ln x$ is the inverse function of $e^x$, we can prove the following.
Prove property (1)
Let $\ln(x) = a$. By definition of inverse function this means $e^a = x$.
Let $\ln(y) = b$, so $e^b = y$. It follows that $\ln(xy) = \ln(e^a e^b) = \ln(e^{a+b}) = a + b = \ln(x) + \ln(y)$.
Thus $\ln(xy) = \ln(x) + \ln(y)$. The other properties are proved similarly.
2.11.2 Example: Bacteria population
(Stewart p. 427 # 45) If a bacteria population starts with 100 bacteria and doubles every 3 hours, then the number of bacteria $n$ after $t$ hours is given by the formula
$n = f(t) = 100 \cdot 2^{t/3}$.
(a) Find the inverse of this function and explain its meaning.
(b) When will the population reach 50000?
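One way to check an answer to this exercise is to solve $n = 100 \cdot 2^{t/3}$ for $t$, which gives $t = 3 \log_2(n/100)$, and evaluate it numerically. The sketch below assumes that rearrangement; the function names are chosen for this example only.

```python
import math


def population(t):
    """Number of bacteria after t hours: 100 * 2**(t / 3)."""
    return 100 * 2 ** (t / 3)


def hours_to_reach(n):
    """Inverse of population: the time t at which the population equals n."""
    return 3 * math.log2(n / 100)


t_50000 = hours_to_reach(50000)
print(round(t_50000, 2), round(population(t_50000)))
```

The inverse function answers the question "how long until the population reaches $n$?", which is the interpretation asked for in part (a).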
2.12 Inverse trigonometric functions
The function $y = \sin x$ is 1-1 if we just define it over the interval $[-\pi/2, \pi/2]$; see Figure 48. The inverse function for this part of $\sin x$ is denoted $\arcsin x$. Thus $\arcsin x$ is defined on the interval $[-1, 1]$ and takes values in the range $[-\pi/2, \pi/2]$. The graph can easily be obtained by reflecting the graph of $\sin x$ about the line $y = x$ over the appropriate interval; see Figure 13.
Similarly $y = \cos x$ is 1-1 on the interval $[0, \pi]$ and its inverse function is denoted $\arccos x$. The function $\arccos x$ is defined on $[-1, 1]$ and takes values in the range $[0, \pi]$. Figure 14 below shows the graphs of $f(x) = \cos x$ before and after reflection about the line $y = x$. This gives the graph of $f^{-1}(x) = \arccos x$.
Figure 13: $f(x) = \sin x$ defined on $[-\pi/2, \pi/2]$ reflected about the line $y = x$ to give $f^{-1}(x) = \arcsin x$.
Figure 14: $f(x) = \cos x$ defined on $[0, \pi]$ reflected about the line $y = x$ to give $f^{-1}(x) = \arccos x$ (written in MATLAB as acos(x)).
Also, $\tan x$ is 1-1 on the open interval $(-\pi/2, \pi/2)$ with inverse function denoted by $\arctan x$. Hence $\arctan$ has the domain $(-\infty, \infty)$ with values in the range $(-\pi/2, \pi/2)$.
Figure 15: $\tan x$ and $\arctan x$.
3 Sequences
A sequence $\{a_n\}$ is an ordered list of numbers
$a_0, a_1, a_2, a_3, \ldots, a_n, \ldots$
A sequence can contain a finite number of terms or may continue forever.
3.1 Formal Definition: Sequence
More formally, a sequence is a function, with domain being $\{0, 1, 2, 3, \ldots\}$. We can also take the domain as $\{1, 2, 3, \ldots\}$ and start the sequence at $a_1$ rather than $a_0$.
If $a : \{0, 1, 2, \ldots\} \to \mathbb{R}$ is a function, viewed as a sequence, then we write $a_0$ instead of $a(0)$, $a_1$ instead of $a(1)$, etc.
3.1.1 Motivation
Infinite sequences of numbers are useful in many applications. The sequence might represent approximations to the solution of a problem, such as those produced by the bisection method. Or the sequence might represent a time series, where the numbers represent the population each year. Mathematically, we are interested in how the sequence behaves as the number of terms becomes large: does it converge to a solution, or how fast does the population grow? Sequences can also appear from the partial sums in an infinite series, which comprises our next major topic.
3.2 Representations
There are two main ways to represent the $n$th term in a sequence. Firstly, there is a direct (or closed form or functional) representation. Secondly, there is a recursive (or indirect) representation.
A direct representation is a formula for $a_n$ in terms of $n$. A recursive description gives a way of obtaining $a_n$ from the previously calculated $a_0, a_1, \ldots, a_{n-1}$.
Often in the recursive case the value of $a_n$ will only depend on the previous one or two terms, such as $a_n = f(a_{n-1}, a_{n-2})$ or $a_n = g(a_{n-1})$.
3.2.1 Example
List the first terms of the sequence $a_n = \frac{n}{n+1}$.
3.2.2 Example
List the first terms of the recursive sequence $a_{n+1} = \frac{1}{3 - a_n}$ with $a_1 = 2$.
Note that for the recursive form, you must express one or more initial conditions or starting values. The recursive form is actually a difference equation, and there are similarities with differential equations that are studied in MATH1052.
3.2.3 Example: Fibonacci sequence
The Fibonacci sequence is defined through
$a_n = f(a_{n-1}, a_{n-2}) = a_{n-1} + a_{n-2}$
with $a_0 = 0$, $a_1 = 1$. List the first 9 terms.
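The difference between a direct and a recursive representation can be made concrete in code. The following Python sketch generates terms for the three sequences in Examples 3.2.1 to 3.2.3; the function names are assumptions for this example, and exact fractions are used so that the terms print as ratios.

```python
from fractions import Fraction


def direct_terms(count):
    """Direct form: a_n = n / (n + 1) for n = 0, 1, ..., count - 1."""
    return [Fraction(n, n + 1) for n in range(count)]


def recursive_terms(count, a1=Fraction(2)):
    """Recursive form: a_{n+1} = 1 / (3 - a_n), starting from a_1."""
    terms = [a1]
    while len(terms) < count:
        terms.append(1 / (3 - terms[-1]))
    return terms


def fibonacci_terms(count):
    """Two-term recursion: a_n = a_{n-1} + a_{n-2}, with a_0 = 0, a_1 = 1."""
    terms = [0, 1]
    while len(terms) < count:
        terms.append(terms[-1] + terms[-2])
    return terms[:count]


print(direct_terms(5))
print(recursive_terms(5))
print(fibonacci_terms(9))
```

Note that the direct form can jump straight to any $n$, while each recursive form needs its starting value(s) and every earlier term.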
3.3 Limits
Let $\{a_n\}_{n=0}^{\infty}$ be a sequence. Then
$\lim_{n \to \infty} a_n = \ell$, $\ell \in \mathbb{R}$, means:
$a_n$ approaches $\ell$ as $n$ gets larger and larger, i.e. $a_n$ is always close to $\ell$ for $n$ sufficiently large.
3.3.1 Convention
If a sequence $\{a_n\}_{n=0}^{\infty}$ has limit $\ell \in \mathbb{R}$, we say that $a_n$ converges to $\ell$ and that the sequence $\{a_n\}_{n=0}^{\infty}$ is convergent. Otherwise the sequence is divergent.
3.3.2 Examples
Determine if the sequences $\{a_n\}$, with $a_n$ as given below, are convergent and if so find their limits.
1. $a_n = \frac{1}{n}$
2. $a_n = \frac{1}{2} a_{n-1}$, $a_0 = -1$.
3. $a_n = (-1)^n$
4. $a_n = r^n$, for $r = \frac{1}{2}$, $r = 1$, $r = 2$
3.4 Theorem: Limit laws
The following limit laws apply provided that the separate limits exist (that is, $\{a_n\}$ and $\{b_n\}$ are convergent):
3.4.1 Example
Give an example of two sequences $a_n$ and $b_n$ such that $\lim_{n \to \infty} a_n$ and $\lim_{n \to \infty} b_n$ do not exist, but $\lim_{n \to \infty}(a_n + b_n)$ does exist.
$a_n = n + 1/n$ and $b_n = 1/n - n$. Then $\lim_{n \to \infty}(a_n + b_n) = \lim_{n \to \infty} 2/n = 0$, but $\lim_{n \to \infty} a_n$ and $\lim_{n \to \infty} b_n$ do not exist.
3.5 Useful sequences to remember
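(1) For constant $c$, $\lim_{n \to \infty} c^n = \begin{cases} 0, & \text{if } |c| < 1 \\ 1, & \text{if } c = 1. \end{cases}$
Sequence $\{c^n\}_{n=0}^{\infty}$ is divergent if $c = -1$ or $|c| > 1$.
(2) For constant $c > 0$, $\lim_{n \to \infty} c^{1/n} = \lim_{n \to \infty} \sqrt[n]{c} = 1$.
(3) $\lim_{n \to \infty} \frac{1}{n^r} = 0$ for $r > 0$.
(4) $\lim_{n \to \infty} n^{1/n} = \lim_{n \to \infty} \sqrt[n]{n} = 1$.
(5) $\lim_{n \to \infty} \frac{1}{n!} = 0$.
(6) For constant $c$, $\lim_{n \to \infty} \frac{c^n}{n!} = 0$.
(7) $\lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = e$.
(8) $\lim_{n \to \infty} \left(1 + \frac{a}{n}\right)^n = e^a$.
Take care with inequalities and limits. For example $\frac{1}{n} > 0$ for all $n$ but $\lim_{n \to \infty} \frac{1}{n} = 0$. In general, even if $a_n > b_n$ for all $n$, we can only conclude $\lim_{n \to \infty} a_n \ge \lim_{n \to \infty} b_n$. Note the $\ge$.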
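Several of these standard limits are easy to observe numerically. The sketch below evaluates items (4), (6), (7) and (8) at a large value of $n$; the function names are chosen for this example, and the check is only an illustration of the limits, not a proof.

```python
import math


def nth_root_of_n(n):
    """Item (4): n**(1/n), which tends to 1."""
    return n ** (1 / n)


def power_over_factorial(c, n):
    """Item (6): c**n / n!, which tends to 0 (c and n are integers here)."""
    return c ** n / math.factorial(n)


def compound(n, a=1.0):
    """Items (7) and (8): (1 + a/n)**n, which tends to e**a."""
    return (1 + a / n) ** n


print(nth_root_of_n(10 ** 6))
print(power_over_factorial(10, 100))
print(compound(10 ** 6), math.e)
print(compound(10 ** 6, a=2.0), math.exp(2))
```

Integer arguments are used in `power_over_factorial` so that Python's exact integer arithmetic handles the very large numerator and denominator before the final division.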
3.6 Theorem: Squeeze
If $a_n \le b_n \le c_n$ for $n \ge n_0$ for some $n_0 \in \mathbb{N}$ and $\lim_{n \to \infty} a_n = \lim_{n \to \infty} c_n = \ell$, then
3.6.1 Example
Use the squeeze theorem on $\{a_n\}$, where $a_n = \frac{1}{n} \sin(n)$.
3.7 The formal definition of a limit of a sequence
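We write
$\lim_{n \to \infty} a_n = \ell$
if for every number $\varepsilon > 0$ there exists a number $n_0$ such that $|a_n - \ell| < \varepsilon$ whenever $n > n_0$.
For example, $\lim_{n \to \infty} a_n = 0$ means that for every $\varepsilon > 0$ there exists an $n_0$ such that $|a_n| < \varepsilon$ whenever $n > n_0$.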
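To see the definition in action, take $a_n = 1/n$ with limit $0$. For a given $\varepsilon$, any $n_0 \ge 1/\varepsilon$ works, because $n > n_0$ then gives $1/n < \varepsilon$. The Python sketch below picks such an $n_0$ and spot-checks it; the function name is an assumption for this example.

```python
import math


def n0_for_reciprocal(eps):
    """Return an n0 such that |1/n - 0| < eps for every integer n > n0."""
    return math.ceil(1 / eps)


for eps in (0.5, 0.01, 1e-4):
    n0 = n0_for_reciprocal(eps)
    # Spot-check the next few hundred terms beyond n0.
    ok = all(abs(1 / n) < eps for n in range(n0 + 1, n0 + 500))
    print(eps, n0, ok)
```

The key point of the definition is that $n_0$ is allowed to depend on $\varepsilon$: a smaller tolerance simply requires going further along the sequence.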
4 Limits
Limits arise when we want to find the tangent to a curve or the velocity of an object, for example. Once we understand limits, we can proceed to studying continuity and calculus in general. You should be aware that limits are a fundamental notion in calculus, so it is important to understand them well.
4.1 Definition: Limit
(Stewart, p. 50) Let $f(x)$ be a function and $\ell \in \mathbb{R}$. We say $f(x)$ approaches the limit $\ell$ (or converges to the limit $\ell$) as $x$ approaches $a$ if we can make the value of $f(x)$ arbitrarily close to $\ell$ (as close to $\ell$ as we like) by taking $x$ to be sufficiently close to $a$ (on either side of $a$) but not equal to $a$.
We write
$\lim_{x \to a} f(x) = \ell$.
Roughly speaking, $f(x)$ is close to $\ell$ for all $x$ values sufficiently close to $a$, with $x \ne a$. The limit “predicts” what should happen at $x = a$ by looking at $x$ values close to but not equal to $a$.
4.1.1 Some Basic Limits
• $\lim_{x \to a} 1 = 1$
• $\lim_{x \to a} x = a$
We can sometimes determine the limit of a function by looking at its graph. Figure 16 shows two functions with the same limit at $x = 5$.
Figure 16: Two functions (Function 1 and Function 2) with limit equal to 10 as $x \to 5$.
Function 1 is described by
$f(x) = \begin{cases} 2x & \text{for } 0 \le x \le 5, \\ -2x + 20 & \text{for } 5 < x \le 10, \end{cases}$
while function 2 is described by
$g(x) = \begin{cases} 2x & \text{for } 0 \le x < 5, \\ 2 & \text{for } x = 5, \\ -2x + 20 & \text{for } 5 < x \le 10. \end{cases}$
Each function has a limit of 10 as $x$ approaches 5. It does not matter that the value of the second function is 2 when $x$ equals 5 since, when dealing with limits, we are only interested in the behaviour of the function as $x$ approaches 5.
4.2 Properties
Suppose that $c$ is a constant and the limits $\ell = \lim_{x \to a} f(x)$ and $m = \lim_{x \to a} g(x)$ exist for some fixed $a \in \mathbb{R}$.
Then
4.2.1 Example
Find the value of $\lim_{x \to 1} (x^2 + 1)$.
4.2.2 Example
(Stewart, p. 52 ex. 1) Determine the value of $\lim_{x \to 1} \frac{x - 1}{x^2 - 1}$.
4.2.3 Definition: Infinite Limits
Let $f$ be a function defined on both sides of $a$, except possibly at $a$ itself. Then
$\lim_{x \to a} f(x) = \infty$
means that the values of $f(x)$ can be made arbitrarily large by taking $x$ sufficiently close to $a$, but not equal to $a$.
Similarly,
$\lim_{x \to a} f(x) = -\infty$
means that the values of $f(x)$ can be made arbitrarily large negatively by taking $x$ sufficiently close to $a$, but not equal to $a$.
In these cases, we say that $f(x)$ diverges to $\pm\infty$. We also say that the limit does not exist in these cases.
Note that the limit properties in section 4.2 do not necessarily apply if the limits diverge.
4.3 One-sided limits
Consider the piecewise function
$f(x) = \begin{cases} 1, & x \ge 0 \\ -2, & x < 0. \end{cases}$
This function has the graph depicted in Figure 17.
Figure 17: The limit of this function as $x \to 0$ does not exist.
Notice that $\lim_{x \to 0,\, x > 0} f(x) = 1$, but $\lim_{x \to 0,\, x < 0} f(x) = -2$. Therefore, the limit as $x \to 0$ does not exist. We can, however, talk about the one-sided limits.
In the above example, we say that the limit as $x \to 0$ from above (or from the right) equals 1 and we write
$\lim_{x \to 0^+} f(x) = 1$.
Similarly, we say that the limit as $x \to 0$ from below (or from the left) equals $-2$ and we write
$\lim_{x \to 0^-} f(x) = -2$.
In general, for $\lim_{x \to a^+} f(x) = \ell$, just consider $x$ with $x > a$ and similarly for $\lim_{x \to a^-} f(x) = \ell$, consider only $x < a$.
4.3.1 Example
Determine $\lim_{x \to 2^+} \sqrt{x - 2}$.
4.3.2 Theorem
Let $f$ be a function defined on some open interval that contains the number $a$, except possibly $a$ itself. Then $\lim_{x \to a} f(x) = \ell$ if and only if
4.3.3 Example
Find $\lim_{x \to 1} f(x)$ where $f(x) = \begin{cases} x^2, & x \ge 1, \\ 2 - x, & x < 1. \end{cases}$
4.4 Theorem: Squeeze principle
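Suppose
$\lim_{x \to a} g(x) = \ell = \lim_{x \to a} h(x)$
and, for $x$ close to $a$ ($x \ne a$),
$h(x) \le f(x) \le g(x)$.
Then
See the graph in Figure 18.
Figure 18: The squeeze principle. (The figure shows $f(x)$ lying between $h(x)$ and $g(x)$ near $x = a$, with common limit $\ell$.)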
4.4.1 Example
Prove that $\lim_{x \to 0} x^2 \sin\left(\frac{1}{x}\right) = 0$.
4.5 Limits as x approaches infinity
We say that $f(x)$ approaches $\ell$ as $x \to \infty$ if $f(x)$ is arbitrarily close to $\ell$ for all large enough values of $x$. Formally, we write:
That is, $f(x)$ can be made arbitrarily close to $\ell$ by taking $x$ sufficiently large. Similarly, we write
if $f(x)$ approaches $\ell$ as $x$ becomes more and more negative.
Note:
$\lim_{x \to \infty} \frac{1}{x} = 0$ and $\lim_{x \to -\infty} \frac{1}{x} = 0$.
Figure 19: Graph of $f(x) = 1/x$.
4.5.1 Example
Determine $\lim_{x \to \infty} \sin x$, or show that it does not exist.
This limit does not exist since $\sin x$ is a periodic function that oscillates between $-1$ and $1$.
4.5.2 Example: ratio of two polynomials, numerator degree = denominator degree
Find $\lim_{x \to \infty} \frac{2x^2 + 3}{3x^2 + x}$.
4.5.3 Example: ratio of two polynomials, numerator degree > denominator degree
Find $\lim_{x \to \infty} \frac{x^2 + 5}{x + 1}$.
The highest power of $x$ in the denominator is 1, so we have
$\lim_{x \to \infty} \frac{x^2 + 5}{x + 1} = \lim_{x \to \infty} \frac{x + \frac{5}{x}}{1 + \frac{1}{x}} = \infty$.
4.5.4 Example: ratio of two polynomials, numerator degree < denominator degree
Find $\lim_{x \to \infty} \frac{x + 1}{x^2 + 1}$.
$\lim_{x \to \infty} \frac{x + 1}{x^2 + 1} = \lim_{x \to \infty} \frac{\frac{1}{x} + \frac{1}{x^2}}{1 + \frac{1}{x^2}} = 0$
The general method in the previous examples has been to:
1. divide the numerator and denominator by the highest power of x in the denominator; and then
2. determine the limits of the numerator and denominator separately.
4.6 Some important limits
The following limits are fundamental. We omit the proofs. Combined with the properties given in 4.2 and the Squeeze Principle given in 4.4, these will enable you to compute a range of other limits.
(1) $\lim_{x \to 0} \frac{\sin x}{x} = 1$ (2) $\lim_{x \to 0} \frac{1 - \cos(x)}{x^2} = \frac{1}{2}$ (3) $\lim_{x \to 0} \frac{e^x - 1}{x} = 1$.
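Although the proofs are omitted, the three limits can be observed numerically by evaluating each quotient at values of $x$ close to (but not equal to) $0$. This is a minimal Python sketch; the function names are chosen for this example, and `math.expm1(x)` is the standard-library function that computes $e^x - 1$ accurately for small $x$.

```python
import math


def sin_quotient(x):
    return math.sin(x) / x


def cos_quotient(x):
    return (1 - math.cos(x)) / x ** 2


def exp_quotient(x):
    return math.expm1(x) / x  # (e**x - 1) / x


for x in (0.1, 0.01, 0.001, -0.001):
    print(x, sin_quotient(x), cos_quotient(x), exp_quotient(x))
```

Taking $x$ extremely small (say $10^{-9}$) is not helpful for the second quotient, because floating-point cancellation in $1 - \cos x$ then dominates; this is a limitation of the numerical check, not of the limit.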
4.6.1 Precise Definition
Let $f$ be a function defined on some open interval that contains the number $a$, except possibly $a$ itself. Then we write
$\lim_{x \to a} f(x) = \ell$
if for every number $\varepsilon > 0$ there is a number $\delta > 0$ such that
$|f(x) - \ell| < \varepsilon$ whenever $0 < |x - a| < \delta$.
Detailed discussion on the precise definition of a limit can be found in Stewart, p. 72.
5 Continuity
5.1 Definition of Continuity
We say that a function f is continuous at a if
If f is not continuous at a, we say that f is discontinuous at a, or f has a discontinuity at a.
Figure 20: Graphical representation of continuity at $x = a$: $f(x)$ approaches $f(a)$ as $x$ approaches $a$.
A function may not be continuous at x = a for a number of reasons.
5.1.1 Example
Let
$f(x) = \begin{cases} x, & x \ne 0 \\ 1, & x = 0. \end{cases}$
Then $f$ has a discontinuity at $x = 0$. This is because $f(0) = 1$ while $\lim_{x \to 0} f(x) = 0$. Hence Condition (iii) of Definition 5.1 does not hold. See Figure 21.
Figure 21: An example of a discontinuous function with discontinuity at $x = 0$.
5.1.2 Example
The function $f(x) = \frac{1}{x^2}$ (see Figure 22) is not continuous at $x = 0$, since
Figure 22: The function $f(x) = 1/x^2$ has a discontinuity at $x = 0$.
5.1.3 Example
$f(x) = \begin{cases} x + 1, & x \ge 0 \\ x^2, & x < 0 \end{cases}$
is not continuous at $x = 0$ since
5.2 Continuity on Intervals
We say that f is continuous on the open interval (a, b), if
If f is continuous on the closed interval [a, b], then f is continuous on (a, b) and
You could think of a function that is continuous on an interval as one whose graph can be drawn over that interval without lifting your pen.
5.2.1 Examples
• Any polynomial in $x$ is continuous on $\mathbb{R}$. For example, $f(x) = ax^2 + bx + c$ is continuous on $\mathbb{R}$.
• $e^x$, $|x|$, $\sin x$, $\cos x$ and $\arctan x$ are continuous on $\mathbb{R}$.
• $f(x) = \ln x$ is continuous on $(0, \infty)$.
5.3 Properties of Continuous Functions
If $f(x)$ and $g(x)$ are continuous at $x = a$ and $c$ is a constant, then
Since the function $f(x) = x$ is continuous, this proves that any polynomial is continuous, and any ratio of polynomials is continuous, provided the denominator is not zero.
5.3.1 Limit of a Composite Function
(Stewart, p. 88 Thm 8) If $f$ is continuous at $b$ and $\lim_{x \to a} g(x) = b$, then $\lim_{x \to a} f(g(x)) = f(b)$. In other words,
Similarly if $\lim_{x \to \infty} g(x) = b$ and $f$ is continuous at $x = b$, then
$\lim_{x \to \infty} f(g(x)) = f(b)$.
5.3.2 Continuity of Composite Functions
(Stewart, p. 89 Thm 9) If g is continuous at a and f is continuous at g(a) then f ◦ g is continuous at a.
5.4 The Intermediate Value Theorem (IVT)
(Stewart, p. 90) Suppose that $f$ is continuous on the closed interval $[a, b]$ and let $N$ be any number between $f(a)$ and $f(b)$, where $f(a) \ne f(b)$. Then there exists a number $c \in (a, b)$ such that $f(c) = N$.
Figure 23: The IVT states that a continuous function takes on every intermediate value between the function values $f(a)$ and $f(b)$. That is, for any $N$ chosen between $f(a)$ and $f(b)$ there is a corresponding value $c$ between $a$ and $b$ with $f(c) = N$. Note that this is not necessarily true if $f$ is discontinuous.
5.4.1 Example
Suppose that a function $f$ is continuous everywhere and that $f(-2) = 3$, $f(-1) = -1$, $f(0) = -4$, $f(1) = 1$, and $f(2) = 5$. Does the Intermediate-Value Theorem guarantee that $f$ has a root on the following intervals?
a) $[-2, -1]$ b) $[-1, 0]$ c) $[-1, 1]$ d) $[0, 2]$ e) $[1, 3]$
5.5 Application of the IVT (Bisection Method)
The bisection method is a procedure for approximating the zeros of a continuous function. It first cuts the interval $[a, b]$ in half (say, at a point $c$), and then decides in which of the smaller intervals ($[a, c]$ or $[c, b]$) the zero lies. This process is repeated until the interval is small enough to give a sufficiently accurate approximation to the zero itself.
We can present this bisection method as an algorithm:
(1) Given $[a, b]$ such that $f(a)f(b) < 0$, let $c = \frac{a + b}{2}$.
(2) If $f(c) = 0$ then quit; $c$ is a zero of $f$.
(3) If $f(c) \ne 0$ then:
(a) If $f(a)f(c) < 0$, a zero lies in the interval $[a, c)$. So replace $b$ by $c$.
(b) If $f(a)f(c) > 0$, replace $a$ by $c$.
(4) If the interval $[a, b]$ is small enough to give a precise enough approximation then quit. Otherwise, go to step (1).
Figure 24: The bisection method is an application of the IVT.
Note that:
$f(a)f(c) < 0 \iff$ ($f(a) < 0$ and $f(c) > 0$) or ($f(a) > 0$ and $f(c) < 0$)
$f(a)f(c) > 0 \iff$ ($f(a) > 0$ and $f(c) > 0$) or ($f(a) < 0$ and $f(c) < 0$)
We will investigate the bisection method in more detail in the lab sessions with MATLAB.
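The algorithm in steps (1) to (4) can be sketched directly in code. The version below is written in Python for illustration only (the labs use MATLAB); the function name `bisect`, the tolerance `tol` and the step cap `max_steps` are assumptions for this example, not part of the workbook.

```python
import math


def bisect(f, a, b, tol=1e-8, max_steps=200):
    """Approximate a zero of a continuous f on [a, b], given f(a) * f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_steps):
        c = (a + b) / 2        # step (1): midpoint of the current interval
        fc = f(c)
        if fc == 0:            # step (2): c is exactly a zero
            return c
        if f(a) * fc < 0:      # step (3a): a zero lies in [a, c)
            b = c
        else:                  # step (3b): otherwise keep the right half
            a = c
        if b - a < tol:        # step (4): interval is small enough
            break
    return (a + b) / 2


# Example: cos(x) - x changes sign on [0, 1], so the IVT guarantees a zero there.
root = bisect(lambda x: math.cos(x) - x, 0.0, 1.0)
print(root)
```

Each pass halves the interval, so the error after $k$ steps is at most $(b - a)/2^k$; this is why the method is reliable but comparatively slow.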
6 Derivatives
Finding the instantaneous velocity of a moving object and other problems involving rates of change are situations where derivatives can be used as a powerful tool. All rates of change can be interpreted as slopes of appropriate tangents. Therefore we shall consider the tangent problem and how it leads to a precise definition of the derivative.
6.1 Tangents
Figure 25: Determine the tangent line at $A$. (The figure shows the chord joining $A = (a, f(a))$ to $B = (b, f(b))$ on the curve $y = f(x)$, with $b = a + h$, horizontal run $h$ and vertical rise $f(b) - f(a)$.)
Consider the graph in Figure 25. We want to determine the tangent line at the point $A$ ($x = a$) on the graph of $y = f(x)$, where $f(x)$ is ‘nice enough’. We approximate the tangent at $A$ by the chord $AB$ where $B$ is a point on the curve (close to $A$) with $x$-value $a + h$, where $h$ is small. By looking at the graph, we can see that the slope $m$ of the chord $AB$ is given by
As the point $B$ gets closer to $A$, $m$ will become closer to the slope at the point $A$. To obtain this value, we take the limit as $h \to 0$:
6.2 Definition of Derivative
The derivative of f at x is defined by
We say that $f$ is differentiable at some point $x$ if this limit exists. Further, we say that $f$ is differentiable on an open interval if it is differentiable at every point in the interval. Note that $f'(a)$ is the slope of the tangent line to the graph of $y = f(x)$ at $x = a$.
We have thus defined a new function $f'$, called the derivative of $f$. Sometimes we use the Leibniz notation $\frac{dy}{dx}$ or $\frac{df}{dx}$ in place of $f'(x)$.
Note that if $f$ is differentiable at $a$, there holds:
$f'(a) = \lim_{x \to a} \frac{f(x) - f(a)}{x - a}$.
6.2.1 Example
Using the definition of the derivative (“from first principles”), find the derivative of $f(x) = x^2 + x$.
6.2.2 Example
Find the derivative from first principles of $f(x) = e^x$.
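A first-principles answer can be checked numerically by evaluating the chord slope $\frac{f(x + h) - f(x)}{h}$ for shrinking $h$ and watching it settle. The Python sketch below does this for the two functions in Examples 6.2.1 and 6.2.2; `difference_quotient` is a name chosen for this example.

```python
import math


def difference_quotient(f, x, h):
    """Slope of the chord from (x, f(x)) to (x + h, f(x + h))."""
    return (f(x + h) - f(x)) / h


def quadratic(x):
    return x * x + x


for h in (0.1, 0.01, 0.001, 1e-6):
    print(h, difference_quotient(quadratic, 3.0, h), difference_quotient(math.exp, 0.0, h))
```

For the quadratic the chord slope at $x = 3$ works out to exactly $7 + h$, so the printed values approach $7$; for $e^x$ at $x = 0$ they approach $1$.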
6.2.3 Standard Properties of the Derivative
Recall the following:
$\frac{d}{dx}(\text{constant}) = 0$
$\frac{d}{dx}(x) = 1$
$\frac{d}{dx}(x^{\alpha}) = \alpha x^{\alpha - 1}$, for $\alpha \in \mathbb{R}$, $x \ge 0$ or for $x \in \mathbb{R}$ if $\alpha \in \mathbb{R}^+$
$\frac{d}{dx}(\sin x) = \cos x$
$\frac{d}{dx}(\cos x) = -\sin x$
$(cf)' = cf'$
$(f \pm g)' = f' \pm g'$
$(fg)' = f'g + fg'$ (the product rule); and
$\left(\frac{f}{g}\right)' = \frac{f'g - fg'}{g^2}$, whenever the denominator is not equal to 0 (the quotient rule).
6.3 Differentiability implies Continuity
If f is differentiable at a, then f is continuous at a.
Suppose $f$ is differentiable at $x = a$.
Recall that the limit of a product is the product of the two limits, if they both exist.
Thus $\lim_{h \to 0} \left[f(a + h) - f(a)\right] = \lim_{h \to 0} h \cdot \frac{f(a + h) - f(a)}{h} = \lim_{h \to 0} h \cdot \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} = 0 \cdot f'(a) = 0$.
Hence $\lim_{h \to 0} f(a + h) = f(a)$. Putting $x = a + h$ gives $\lim_{x \to a} f(x) = f(a)$. So $f$ is continuous at $x = a$.
The converse is false. That is, if f is continuous at a, then it is not necessarily differentiable at a:
6.4 The Chain Rule
(Stewart, p. 152) If $f$ and $g$ are both differentiable, then $f \circ g$ is differentiable with derivative given by
In the Leibniz notation, if y = f(u) and u = g(x) are both differentiable functions, then
6.4.1 Example
Differentiate $y = \sqrt{e^{\sin x}}$.
6.5 Derivative of an Inverse Function
Suppose $y = f^{-1}(x)$, where $f^{-1}$ is the inverse of $f$. To obtain $\frac{dy}{dx}$ we use
$x = f(f^{-1}(x)) = f(y)$.
Differentiating both sides with respect to $x$ using the chain rule gives
6.5.1 Example
Find the derivative of $y = \ln x$.
6.5.2 Example
Find the derivative of $y = \arcsin x$.
First note that the domain for the arcsin function is $[-1, 1]$ and the range is $[-\frac{\pi}{2}, \frac{\pi}{2}]$.
We have $x = \sin y$. Therefore on $(-1, 1)$
$\frac{dy}{dx} = \frac{1}{\left(\frac{dx}{dy}\right)} = \frac{1}{\cos y}$ (note $\cos(y) > 0$ for $y \in (-\frac{\pi}{2}, \frac{\pi}{2})$)
$= \frac{1}{\sqrt{1 - \sin^2 y}} = \frac{1}{\sqrt{1 - x^2}}$.
Therefore, $\frac{d}{dx}(\arcsin x) = \frac{1}{\sqrt{1 - x^2}}$, $x \in (-1, 1)$.
6.5.3 Other Inverse Trig Derivatives
6.6 L’Hopital’s Rule
(Stewart, p. 491) Suppose that $f$ and $g$ are differentiable and $g'(x) \ne 0$ near $a$ (except possibly at $a$). Suppose that
$\lim_{x \to a} f(x) = 0$ and $\lim_{x \to a} g(x) = 0$
or
$\lim_{x \to a} f(x) = \pm\infty$ and $\lim_{x \to a} g(x) = \pm\infty$.
Then
if the limit on the right exists or is $\pm\infty$.
6.6.1 Example
Find $\lim_{x \to 1} \frac{\ln x}{x - 1}$.
6.6.2 Example
Find $\lim_{x \to 0^+} x \ln x$.
6.7 Continuous Extension of Sequences
Sometimes L’Hopital’s rule can be used to evaluate limits of sequences. Let $f$ be a function on the real numbers such that $\lim_{x \to \infty} f(x)$ exists. Let $f(n) = a_n$ for natural numbers $n$. Then:
6.7.1 Example
Evaluate $\lim_{n \to \infty} \frac{\ln n}{n}$.
6.8 The Mean Value Theorem (MVT)
(Stewart, p. 215) Let $f$ be continuous on $[a, b]$ and differentiable on $(a, b)$. Then
$\frac{f(b) - f(a)}{b - a} = f'(c)$
for some $c$, where $a < c < b$.
Note $f'(c)$ is the slope of $y = f(x)$ at $x = c$ and $\frac{f(b) - f(a)}{b - a}$ is the slope of the chord joining $A = (a, f(a))$ to $B = (b, f(b))$.
6.9 Increasing and Decreasing Functions
A function f is called strictly increasing on an interval I if
while f is called strictly decreasing on I if
This leads to the following test.
6.10 Increasing/Decreasing Test
(Stewart, p. 221) Suppose that f is continuous on [a, b] and differentiable on (a, b).
6.11 Local Maxima and Minima
A function f has a local maximum at a if
Similarly, f has a local minimum at b if
6.11.1 Critical Points
A function f is said to have a critical point at x = a, a ∈ dom(f) if
6.11.2 Global Maximum and Minimum
Let $f$ be a function defined on the interval $[a, b]$, and $c \in [a, b]$.
Then $f$ has a global maximum at $c$ if
Similarly f has a global minimum at c if
6.11.3 Example
Consider the function in Figure 26. The domain of this function is the interval [p, u].
The global maximum is at point $s$. The global minimum is at point $p$. Local maxima are at points $q$ and $s$. Local minima are at points $r$ and $t$.
Figure 26: What do the points $x = p, q, r, s, t, u$ represent?
We can use the derivative to find where the local max/min occur:
6.11.4 Using the Derivative to Find Max/Min
If f has a local maximum/minimum at x = c and f ′(c) exists then f ′(c) = 0.
Proof (for max):
6.12 The First Derivative Test
Let f be continuous on [a, b] and differentiable on (a, b), and let c ∈ (a, b).
(a) If f ′(x) > 0 for a < x < c and f ′(x) < 0 for c < x < b then f has a local maximum at c.
(b) If f ′(x) < 0 for a < x < c and f ′(x) > 0 for c < x < b then f has a local minimum at c.
Proof:
6.13 Higher Derivatives
Differentiating a function $y = f(x)$ $n$ times, if possible, gives the $n$th derivative of $f$, usually denoted
$f^{(n)}(x)$, $\frac{d^n f}{dx^n}$ or $\frac{d^n y}{dx^n}$.
6.14 The Second Derivative Test
Suppose $f''$ exists at $c$.
(a) If $f'(c) = 0$ and $f''(c) > 0$, then $f$ has a local minimum at $c$.
(b) If $f'(c) = 0$ and $f''(c) < 0$, then $f$ has a local maximum at $c$.
(c) If $f'(c) = 0$ and $f''(c) = 0$, then the test is inconclusive and we get no useful information. In this case the first derivative test should be used.
6.15 The Extreme Value Theorem
(Stewart, p. 206) If f is continuous on a closed interval [a, b], then
6.15.1 Example
Find and classify the critical points of $f(x) = (3 - x)e^{3x - \frac{1}{2}x^2}$. Find the range of $f$.
$f'(x) = (-1)e^{3x - \frac{1}{2}x^2} + (3 - x)\frac{d}{dx}\left(e^{3x - \frac{1}{2}x^2}\right)$
$= (-1)e^{3x - \frac{1}{2}x^2} + (3 - x)e^{3x - \frac{1}{2}x^2}\frac{d}{dx}\left(3x - \frac{1}{2}x^2\right)$
$= (-1)e^{3x - \frac{1}{2}x^2} + (3 - x)e^{3x - \frac{1}{2}x^2}(3 - x)$
$= e^{3x - \frac{1}{2}x^2}\left((-1) + (3 - x)(3 - x)\right)$
$= e^{3x - \frac{1}{2}x^2}(x^2 - 6x + 8)$
$= e^{3x - \frac{1}{2}x^2}(x - 2)(x - 4)$.
Critical points: $x = 2$, $x = 4$.
The derivative has the same sign as the quadratic $(x - 2)(x - 4)$ so at $x = 2$ the sign changes + 0 −. By the first derivative test there is a local max at $x = 2$. Similarly at $x = 4$ there is a local min.
Or use the second derivative test: $f''(x) = -e^{3x - \frac{1}{2}x^2}(x - 3)\left((x - 3)^2 - 3\right)$. Check that $f''(2) < 0$ and $f''(4) > 0$.
$(2, e^4)$ local max, $(4, -e^4)$ local min.
To find the range, we need to know what happens as $x \to \pm\infty$.
By l’Hopital
$\lim_{x \to \infty} f(x) = \lim_{x \to \infty} \frac{3 - x}{e^{\frac{1}{2}x^2 - 3x}} = \lim_{x \to \infty} \frac{-1}{e^{\frac{1}{2}x^2 - 3x} \cdot (x - 3)} = 0$
Similarly $\lim_{x \to -\infty} f(x) = 0$. So the local max is also the global max, and the same for the global min.
Alternative: for $x \le 2$ the continuous function is positive and increasing, so $0 < f(x) \le e^4$; on $[2, 4]$ it decreases from $e^4$ to $-e^4$; and for $x \ge 4$ it is negative and increasing towards 0, so $-e^4 \le f(x) < 0$.
Thus the range is $[-e^4, e^4]$.
6.15.2 Newton’s Method
Newton’s method is a numerical procedure to approximate the roots of an equation. Most programs which find roots of equations are based on this method. For example, consider the equation
$\cos x = x$.
How would you find the roots of this equation? It is impossible to solve analytically (i.e., precisely), but Newton’s method gives us a way of approximating the roots to some degree of accuracy.
The idea behind the geometry of this method is given in Figure 27. We must have a starting guess (the value $x_1$), then we find the tangent line to the curve at that point, and find where the tangent line cuts the $x$-axis. This gives us the next point. We repeat the process until we have a close enough estimate.
Figure 27: Starting at $x_1$, this gives you an idea of how Newton’s method iterates to estimate the root $p$.
Mathematically, the slope of the tangent line to the curve at the point $(x_1, f(x_1))$ is given by
$f'(x_1) = \frac{f(x_1) - 0}{x_1 - x_2} \implies x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}$.
Repeating this procedure leads to the iterative formula
If the numbers xn become closer and closer to p as n becomes large, we say the sequence converges to p.
This method is very sensitive to the type of curve and the choice of starting point.
We will investigate Newton’s method in more detail in the lab sessions using MATLAB.
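The tangent-line step $x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}$ can be repeated in a loop. The sketch below applies it to $\cos x = x$, rewritten as $f(x) = \cos x - x = 0$. It is written in Python for illustration (the labs use MATLAB); the function name `newton`, the stopping tolerance `tol` and the step cap `max_steps` are assumptions for this example.

```python
import math


def newton(f, df, x, tol=1e-12, max_steps=50):
    """Repeat x <- x - f(x) / df(x) until the update is smaller than tol."""
    for _ in range(max_steps):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton's method did not converge")


# Solve cos(x) = x by finding a zero of f(x) = cos(x) - x, starting from x1 = 1.
root = newton(lambda x: math.cos(x) - x, lambda x: -math.sin(x) - 1, 1.0)
print(root, math.cos(root))
```

The step cap matters because, as noted above, the method is sensitive to the curve and the starting point: if $f'(x_n)$ is close to zero the iterates can jump far away instead of converging.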
7 Series
A finite series is a sum of finitely many terms $a_1 + a_2 + \cdots + a_n$. An infinite series is a sum of infinitely many terms $a_1 + a_2 + a_3 + \cdots$.
We shall see that if we add an infinite number of terms the result may be finite or infinite.
If the series has a finite sum, we say it converges.
Our task will be: Given a series, determine if it converges.
Whether or not a series converges is not obvious.
7.1 Infinite sums (notation)
If we have an infinite sum we write
$a_0 + a_1 + a_2 + \ldots + a_n + \ldots = \sum_{n=0}^{\infty} a_n$.
Note that the lower bound (n = 0) of the sum may vary.
7.2 Motivation
Series come from many fields.
1. Approximation to problem solutions:
• a0 zeroth order approximation
• a0 + a1 (a1 small) first order approximation
• a0 + a1 + a2 (a2 very small) second order approximation
• a0 + a1 + a2 + . . .+ an nth order
• a0 + a1 + a2 + . . .+ an + . . . exact solution, provided the series converges
2. Current state of a process over infinite time horizon
3. Approximating functions via Taylor/Fourier Series, e.g.,
$$e^x = 1 + x + \frac{1}{2}x^2 + \frac{1}{3!}x^3 + \ldots,$$
$$f(t) = c_0 + c_1 \sin t + c_2 \sin(2t) + c_3 \sin(3t) + \ldots$$
4. Riemann sums
7.2.1 Example
Consider the series
$$1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n^2}.$$
Work out the sum of the first few terms.
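$$\sum_{n=1}^{1} \frac{1}{n^2} = 1$$
$$\sum_{n=1}^{2} \frac{1}{n^2} = 1 + \frac{1}{4} = 1.25$$
$$\sum_{n=1}^{3} \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} \simeq 1.36$$
$$\sum_{n=1}^{4} \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} \simeq 1.42$$
$$\sum_{n=1}^{10} \frac{1}{n^2} = 1 + \frac{1}{4} + \frac{1}{9} + \frac{1}{16} + \cdots + \frac{1}{100} \simeq 1.55$$
$N$         $\sum_{n=1}^{N} \frac{1}{n^2}$
10          1.549768
100         1.634984
1000        1.643935
10000       1.644834
100000      1.644924
$10^6$      1.644933
$10^7$      1.644934
It appears $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges, to about 1.6449.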
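The first few rows of such a table can be reproduced with a short loop. This is a sketch in Python (not the MATLAB used in the labs); the function name `partial_sum` is our own. For reference, the exact sum is known to be $\pi^2/6 \approx 1.644934$.

```python
def partial_sum(N):
    """Return the N-th partial sum 1 + 1/4 + ... + 1/N^2."""
    # Adding the smallest terms first limits floating-point rounding error.
    return sum(1.0 / (n * n) for n in range(N, 0, -1))

for N in (10, 100, 1000, 10000):
    print(N, round(partial_sum(N), 6))
```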
7.3 The Harmonic Series
The series
$$1 + \frac{1}{2} + \frac{1}{3} + \cdots = \sum_{n=1}^{\infty} \frac{1}{n}$$
is called the Harmonic Series.
Work out the sum of the first few terms
$$\sum_{n=1}^{1} \frac{1}{n} = 1$$
$$\sum_{n=1}^{2} \frac{1}{n} = 1 + \frac{1}{2} = 1.5$$
$$\sum_{n=1}^{3} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} \simeq 1.83$$
$$\sum_{n=1}^{4} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} \simeq 2.08$$
$N$                        $\sum_{n=1}^{N} \frac{1}{n}$
10                         2.928968
100                        5.187378
1000                       7.485471
$10^4$                     9.787606
$10^5$                     12.090146
$10^6$                     14.392727
$\vdots$                   $\vdots$
(one googol) $10^{100}$    230.836
It appears the harmonic series diverges (very slowly).
7.4 Definition of Convergence
(Stewart, p. 748) Given a series $\sum_{n=0}^{\infty} a_n = a_0 + a_1 + a_2 + \ldots$, let $s_n$ denote its $n$th partial sum:
If the sequence $\{s_n\}$ is convergent (i.e. $\lim_{n\to\infty} s_n = s$ with $s \in \mathbb{R}$), then the series $\sum_{n=0}^{\infty} a_n$ is said to be convergent and we write
$$\lim_{n\to\infty} s_n = \sum_{n=0}^{\infty} a_n = s.$$
The number s is called the sum of the series. Otherwise the series is said to be divergent.
7.4.1 Example
Does the series $\sum_{n=0}^{\infty} (-1)^n$ converge or diverge?
7.4.2 Example
Show that the series $\sum_{n=0}^{\infty} \frac{1}{(n+1)(n+2)}$ is convergent.
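Before working these examples by hand, it can help to look at the partial sums directly. The sketch below is in Python with exact fractions; the function names are our own, and the printout only suggests the behaviour (it does not replace the argument asked for in the examples).

```python
from fractions import Fraction

def alternating_partial_sums(count):
    """Partial sums of (-1)^0 + (-1)^1 + ..., one entry per added term."""
    sums, total = [], 0
    for n in range(count):
        total += (-1) ** n
        sums.append(total)
    return sums

def telescoping_partial_sum(N):
    """Exact sum of 1/((n+1)(n+2)) for n = 0, 1, ..., N."""
    return sum(Fraction(1, (n + 1) * (n + 2)) for n in range(N + 1))

print(alternating_partial_sums(6))
for N in (1, 9, 99):
    print(N, telescoping_partial_sum(N))
```

The first list keeps switching between two values, while the second set of sums matches $1 - \frac{1}{N+2}$, which follows from writing $\frac{1}{(n+1)(n+2)} = \frac{1}{n+1} - \frac{1}{n+2}$.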
7.5 The p-test
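For $p \in \mathbb{R}$, the $p$-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ is convergent if $p > 1$ and divergent if $p \le 1$.
Note the above sum is from $n = 1$. This is just a matter of taste since
$$\sum_{n=1}^{\infty} \frac{1}{n^p} = \sum_{n=0}^{\infty} \frac{1}{(n+1)^p}.$$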
7.5.1 Intuition
If all the terms in the series stay “large”, then the series is divergent. If the individual terms are becoming smaller then the series may converge, but not necessarily.
7.5.2 The Harmonic Series Diverges
Write the partial sums
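$$s_1 = 1$$
$$s_2 = 1 + \frac{1}{2}$$
$$s_3 = 1 + \frac{1}{2} + \frac{1}{3}$$
$$s_4 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} \ge 1 + \frac{1}{2} + \underbrace{\frac{1}{4} + \frac{1}{4}} = 1 + \frac{1}{2} + \frac{1}{2}$$
$$s_8 = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) \ge 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) = 1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{2}.$$
In general $s_{2^n} \ge 1 + n/2$, so the partial sums approach infinity as $n \to \infty$.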
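The bound $s_{2^n} \ge 1 + n/2$ can be checked exactly for small $n$. This is a sketch in Python using exact fractions; the function name is our own.

```python
from fractions import Fraction

def harmonic_partial_sum(N):
    """Exact N-th partial sum 1 + 1/2 + ... + 1/N."""
    return sum(Fraction(1, k) for k in range(1, N + 1))

for n in range(8):
    s = harmonic_partial_sum(2 ** n)
    bound = 1 + Fraction(n, 2)
    # Columns: n, the partial sum with 2^n terms, the lower bound, comparison.
    print(n, float(s), float(bound), s >= bound)
```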
7.6 The Divergence Test (also called the nth term test)
(Stewart, p. 753) If $\sum_{n=0}^{\infty} a_n$ is convergent then $\lim_{n\to\infty} a_n = 0$.
The divergence test is:
If $\lim_{n\to\infty} a_n$ does not exist or $\lim_{n\to\infty} a_n \neq 0$ then the series is divergent.
7.6.1 Examples
Verify that these series diverge by the divergence test.
(i) $\sum_{n=0}^{\infty} (-1)^n$
(ii) $\sum_{n=1}^{\infty} \frac{n^2}{5n^2 + 4}$
7.6.2 Caution
If $\lim_{n\to\infty} a_n = 0$ then we cannot say whether the series is convergent or divergent.
For example,
$$\lim_{n\to\infty} \frac{1}{n} = 0$$
but the harmonic series
$$\sum_{n=1}^{\infty} \frac{1}{n} = 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots$$
diverges ($p$-series with $p = 1$).
7.7 Geometric series
The series $\sum_{n=0}^{\infty} a r^n$ is convergent if $|r| < 1$ with
$$\sum_{n=0}^{\infty} a r^n = \frac{a}{1 - r}$$
and divergent if $|r| \ge 1$.
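A quick numerical comparison of a partial sum with $a/(1-r)$ is sketched below in Python. The function name and the values $a = 3$, $r = 0.5$ are our own illustrative choices.

```python
def geometric_partial_sum(a, r, N):
    """Return a + a*r + a*r^2 + ... + a*r^N."""
    total, term = 0.0, a
    for _ in range(N + 1):
        total += term
        term *= r
    return total

a, r = 3.0, 0.5  # illustrative values with |r| < 1
print(geometric_partial_sum(a, r, 50), a / (1 - r))
```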
7.7.1 Proof
7.8 Application: Bouncing ball
(Stewart, p. 757 # 74) A certain ball has the property that each time it falls from a height $h$ onto a hard, level surface, it rebounds to a height $rh$, where $0 < r < 1$. Suppose that the ball is dropped from an initial height of $H$ metres.
(a) Find the total distance that the ball travels.
(b) Calculate the total time that the ball travels. Use the fact that the ball falls $gt^2/2$ metres in $t$ seconds (from classical physics).
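One way to check an answer to this problem is to add up the bounces numerically. The sketch below is in Python and makes several assumptions of our own: $g = 9.8$, the values $H = 1$ and $r = 0.5$, and that each rebound takes as long to rise as it does to fall. The closed forms it compares against come from summing the two geometric series involved.

```python
import math

def bounce_totals(H, r, g=9.8, bounces=2000):
    """Add up distance and time for the first drop plus `bounces` rebounds."""
    distance = H
    time = math.sqrt(2 * H / g)      # from H = g t^2 / 2
    height = H
    for _ in range(bounces):
        height *= r                  # rebound height
        distance += 2 * height       # up, then back down
        time += 2 * math.sqrt(2 * height / g)
    return distance, time

H, r = 1.0, 0.5                      # illustrative values
d, t = bounce_totals(H, r)
d_closed = H * (1 + r) / (1 - r)
t_closed = math.sqrt(2 * H / 9.8) * (1 + math.sqrt(r)) / (1 - math.sqrt(r))
print(d, d_closed)
print(t, t_closed)
```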
7.9 The Comparison Test
(Stewart, p. 767) If we are trying to determine whether or not $\sum_{n=0}^{\infty} a_n$ converges, where $a_n$ contains complicated terms, then it may be possible to bound the series by using simpler terms. If $|a_n|$ is always smaller than $|b_n|$ and $\sum_{n=0}^{\infty} |b_n|$ converges, then $\sum_{n=0}^{\infty} a_n$ should also. On the other hand, if $a_n$ is always large, in fact larger than $|b_n|$, and $\sum_{n=0}^{\infty} b_n$ diverges, then $\sum_{n=0}^{\infty} a_n$ should diverge too.
Suppose that $\sum_{n=0}^{\infty} a_n$ and $\sum_{n=0}^{\infty} b_n$ are series with all non-negative terms.
7.9.1 Example
Test the series $\sum_{n=1}^{\infty} \frac{\ln n}{n}$ for convergence.
7.9.2 Example
Test the series∞∑n=0
5
2 + 3nfor convergence.
7.10 Alternating Series
(Stewart, p. 772) An alternating series is a series whose terms alternate in sign ($+, -, +, -, \ldots$ or $-, +, -, +, \ldots$).
Here is an interesting result about a special alternating series:
We have seen that the harmonic series
$$1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \ldots$$
is divergent. However, the alternating series
$$1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots$$
is convergent. Why? Later, we shall see that this series converges to $\ln 2$.
If the alternating series
$$\sum_{n=0}^{\infty} (-1)^n b_n = b_0 - b_1 + b_2 - b_3 + \ldots \quad (\text{all } b_n > 0)$$
satisfies
(i) $b_{n+1} \le b_n$ for all $n$, and
(ii) $\lim_{n\to\infty} b_n = 0$,
then the series is convergent.
The alternating series test can only be used to show convergence, not divergence.
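The claim that $1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots$ approaches $\ln 2$ can be explored numerically. This is a sketch in Python; the function name is our own.

```python
import math

def alternating_harmonic(N):
    """Partial sum 1 - 1/2 + 1/3 - ... with N terms."""
    return sum((-1) ** (n + 1) / n for n in range(1, N + 1))

for N in (10, 100, 1000, 10000):
    s = alternating_harmonic(N)
    # Columns: number of terms, partial sum, distance from ln 2.
    print(N, s, abs(s - math.log(2)))
```

The distance from $\ln 2$ shrinks only slowly, roughly in proportion to $1/N$.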
7.10.1 Example
$$\sum_{n=0}^{\infty} (-1)^n \frac{1}{\sqrt{n+1}}$$
7.10.2 Example
$$\sum_{n=1}^{\infty} (-1)^n \frac{3n}{4n - 1}$$
7.11 Absolute and conditional convergence
7.11.1 Examples
Are the following series conditionally or absolutely convergent?
(i) $\sum_{n=1}^{\infty} (-1)^n \frac{1}{n}$
(ii) $\sum_{n=1}^{\infty} (-1)^n \frac{1}{n^4}$
(iii) $\sum_{n=1}^{\infty} \frac{\cos n}{n^2}$
7.11.2 Absolute convergence
If a series is absolutely convergent, then it is convergent.
7.11.3 Proof
7.12 The Ratio Test
(Stewart, p. 779) A powerful test for convergence of series is the ratio test. We will use it extensively in the following sections.
1. If $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = L < 1$, then the series $\sum_{n=1}^{\infty} a_n$ is absolutely convergent (and therefore convergent).
2. If $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = L > 1$ or $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \infty$, then the series $\sum_{n=1}^{\infty} a_n$ is divergent.
3. If $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = 1$, then the ratio test is inconclusive.
4. If $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|$ is not defined, then the ratio test is inconclusive.
Note that the ratio test only tests for absolute convergence. The ratio test does not give any estimates of the sum.
7.12.1 Examples
Use the ratio test to determine, if possible, whether the following series are absolutely convergent.
(i) $\sum_{n=0}^{\infty} a r^n$
(ii) $\sum_{n=1}^{\infty} (-1)^n \frac{n^3}{3^n}$
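To get a feel for the limit in the ratio test, the ratios $|a_{n+1}/a_n|$ for example (ii) can be tabulated. This is a sketch in Python with exact fractions (so that very small terms do not underflow); the names `ratio` and `term` are our own.

```python
from fractions import Fraction

def term(n):
    """a_n = (-1)^n n^3 / 3^n from example (ii)."""
    return Fraction((-1) ** n * n ** 3, 3 ** n)

def ratio(a, n):
    """Return |a(n+1) / a(n)| for a term function a."""
    return abs(a(n + 1) / a(n))

for n in (1, 10, 100, 1000):
    print(n, float(ratio(term, n)))
```

The printed values settle near $1/3$, which suggests what the limit $L$ should be when the example is worked by hand.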
8 Power series and Taylor series
8.1 Definition: Power series
A power series is a series of the form
$$\sum_{n=0}^{\infty} c_n (x - a)^n = c_0 + c_1 (x - a) + c_2 (x - a)^2 + \ldots$$
Such series have many applications, for example, solutions of higher order O.D.E.s (ordinary differential equations) with non-constant coefficients, which in turn are very important to many areas of science and engineering.
8.1.1 Example: Bessel function
Use the ratio test to determine for which values of $x$ the Bessel function of order zero (denoted $J_0(x)$) converges, where
$$J_0(x) = \sum_{n=0}^{\infty} \frac{(-1)^n x^{2n}}{2^{2n} (n!)^2}.$$
Bessel functions first arose when Bessel solved Kepler's equation for describing planetary motion. Since then, they have been used in applications such as temperature distribution on a circular plate and the shape of a vibrating membrane (such as the human ear).
The graph of J0(t) is given in Figure 28.
Figure 28: The Bessel function $J_0(t)$ is represented as besselj(0,t) in MATLAB. (Plot of besselj(0,t) against $t$ for $0 \le t \le 30$.)
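The series for $J_0(x)$ can also be summed directly. The sketch below is in Python rather than MATLAB; the function name `bessel_j0` and the cut-off of 40 terms are our own choices, and the partial sum is only reliable for moderate $|x|$.

```python
import math

def bessel_j0(x, terms=40):
    """Partial sum of the power series for J0(x)."""
    total = 0.0
    for n in range(terms):
        numerator = (-1) ** n * x ** (2 * n)
        denominator = 2 ** (2 * n) * math.factorial(n) ** 2
        total += numerator / denominator
    return total

for x in (0.0, 1.0, 2.0, 5.0):
    print(x, bessel_j0(x))
```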
8.2 Radius of convergence
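(Stewart, p. 789) For a given power series $\sum_{n=0}^{\infty} c_n (x - a)^n$ there are only three possibilities:
(1) The series converges only when $x = a$.
(2) The series converges for all $x$.
(3) There is a positive number $r$ such that the series converges if $|x - a| < r$ and diverges if $|x - a| > r$.
In case (1), we say that the radius of convergence is 0, while in case (2) we say that the radius of convergence is $\infty$.
The interval of convergence of a power series is the interval that consists of all values of $x$ for which the series converges. In case (1) the interval is a single point, $a$. In case (2) the interval is $(-\infty, \infty)$. In case (3) the interval is either $(a - r, a + r)$, $(a - r, a + r]$, $[a - r, a + r)$ or $[a - r, a + r]$.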
8.2.1 Example
What is the radius of convergence for the series $\sum_{n=0}^{\infty} \frac{(-3)^n x^n}{\sqrt{n+1}}$? For which values of $x$ does the series converge?
8.3 Taylor Series
Given a function $f$ it is sometimes possible to expand $f$ as a power series
$$f(x) = \sum_{n=0}^{\infty} c_n (x - a)^n = c_0 + c_1 (x - a) + c_2 (x - a)^2 + c_3 (x - a)^3 + \ldots$$
How do we find the coefficients $c_n$ in general? Let us assume that $f$ and all its derivatives are defined at $x = a$.
First note that
$$f(a) = c_0.$$
Now take the derivative of both sides. Assuming that we are justified in differentiating term by term on the right-hand side, this gives
$$f'(x) = c_1 + 2 c_2 (x - a) + 3 c_3 (x - a)^2 + \ldots$$
so that
$$f'(a) = c_1.$$
Taking the derivative again and again gives
$$f''(a) = 2 c_2 \Rightarrow c_2 = f''(a)/2$$
$$f'''(a) = 6 c_3 \Rightarrow c_3 = f'''(a)/3!$$
and, in general
$$c_n = f^{(n)}(a)/n!, \quad \text{with } f^{(0)}(x) \equiv f(x).$$
This leads to
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n.$$
The right-hand side is called the Taylor series of $f(x)$ about $x = a$. So this result says that if $f(x)$ has a power series representation (expansion) at $x = a$, it must be given by the formula above.
When $a = 0$ in the Taylor series the series is sometimes called a MacLaurin series.
8.3.1 Note
The function $f$ and all its derivatives must be defined at $x = a$ for the series to exist. In this case we say that $f$ is infinitely differentiable at $x = a$.
8.4 The Formula for Taylor Series
Suppose $f(x)$ is infinitely differentiable at $x = a$. Then for $f$ “sufficiently well behaved”,
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(a)}{n!} (x - a)^n, \quad \text{for } |x - a| < r,$$
where $r$ is the radius of convergence.
(Note that if we define a sequence of numbers
$$b_n = \max_{x \in [-r, r]} |f^{(n)}(x)|, \quad n \in \mathbb{N},$$
then the “sufficiently well behaved” means, among other things,
$$\lim_{n\to\infty} \frac{b_n r^n}{n!} = 0.$$
Unless otherwise noted, all functions we consider in this section are “sufficiently well behaved”.)
For $|x - a| < r$, we have approximately
$$f(x) \simeq \sum_{n=0}^{k} \frac{f^{(n)}(a)}{n!} (x - a)^n$$
which is a useful polynomial approximation. The smaller $|x - a|$, the better the approximation. This approximation can be made accurate to arbitrary order by taking $k$ sufficiently large provided $|x - a| < r$.
This begins to address the question of whether or not $f$ has a power series representation. The details are beyond the scope of this course. For those interested, see Stewart, pp. 800-802.
8.4.1 Example
Find the Taylor series of $f(x) = \ln x$ about $x = 1$. Determine its radius of convergence.
8.4.2 Example
Find the Taylor series of $f(x) = e^x$ about $x = a$. Determine its radius of convergence.
8.4.3 Example
Find the Taylor series for $f(x) = \sin x$ about $x = 0$. Determine its radius of convergence.
Similarly
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
It is worth memorizing the series for $e^x$, $\sin x$ and $\cos x$.
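The standard Maclaurin series for $e^x$, $\sin x$ and $\cos x$ can be compared with library values by truncating each series. This is a sketch in Python; the function names and the number of terms are our own choices.

```python
import math

def exp_taylor(x, k):
    """Taylor polynomial of e^x about 0, up to the x^k term."""
    return sum(x ** n / math.factorial(n) for n in range(k + 1))

def sin_taylor(x, k):
    """First k + 1 nonzero terms of the series for sin x about 0."""
    return sum((-1) ** n * x ** (2 * n + 1) / math.factorial(2 * n + 1)
               for n in range(k + 1))

def cos_taylor(x, k):
    """First k + 1 nonzero terms of the series for cos x about 0."""
    return sum((-1) ** n * x ** (2 * n) / math.factorial(2 * n)
               for n in range(k + 1))

x = 0.5
for k in (1, 2, 4, 8):
    print(k, exp_taylor(x, k) - math.exp(x),
          sin_taylor(x, k) - math.sin(x),
          cos_taylor(x, k) - math.cos(x))
```

Each column of differences shrinks rapidly as $k$ grows, in line with the remark that larger $k$ gives a more accurate polynomial approximation.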
8.5 New series from old
Sometimes it is possible to find the Taylor series of $f$ without computing all its derivatives, by comparing $f$ to an already known series.
8.5.1 Example
Find the Taylor series expansions for the following functions about $x = 0$:
(1) $f(x) = x^2 \cos x$.
The derivatives of $f$ become complicated. E.g. $f'''(x) = -6 \sin(x) - 6x \cos(x) + x^2 \sin(x)$.
Instead, just multiply the series for $\cos x$ by $x^2$:
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
$$\therefore\ x^2 \cos x = x^2 \left(1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots\right) = x^2 - \frac{x^4}{2!} + \frac{x^6}{4!} - \frac{x^8}{6!} + \cdots$$
(2) $f(x) = e^{x^2}$.
We know the Taylor series for $e^t$ about $t = 0$:
$$e^t = 1 + t + \frac{t^2}{2!} + \frac{t^3}{3!} + \frac{t^4}{4!} + \cdots$$
Substitute $x^2$ in place of $t$:
$$e^{x^2} = 1 + x^2 + \frac{(x^2)^2}{2!} + \frac{(x^2)^3}{3!} + \frac{(x^2)^4}{4!} + \cdots = 1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \frac{x^8}{4!} + \cdots$$
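Both series obtained in this example, $x^2 - \frac{x^4}{2!} + \frac{x^6}{4!} - \ldots$ for $x^2 \cos x$ and $1 + x^2 + \frac{x^4}{2!} + \frac{x^6}{3!} + \ldots$ for $e^{x^2}$, can be sanity-checked against the functions themselves. This is a sketch in Python; the function names, the number of terms and the test point $x = 0.7$ are our own choices.

```python
import math

def x2_cos_series(x, terms=15):
    """Partial sum of x^2 - x^4/2! + x^6/4! - x^8/6! + ..."""
    return sum((-1) ** n * x ** (2 * n + 2) / math.factorial(2 * n)
               for n in range(terms))

def exp_x2_series(x, terms=30):
    """Partial sum of 1 + x^2 + x^4/2! + x^6/3! + ..."""
    return sum(x ** (2 * n) / math.factorial(n) for n in range(terms))

x = 0.7
print(x2_cos_series(x), x ** 2 * math.cos(x))
print(exp_x2_series(x), math.exp(x ** 2))
```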
8.6 Binomial Series
For a non-negative integer $m$, the binomial series is
$$(1 + x)^m = \sum_{n=0}^{m} \binom{m}{n} x^n,$$
where $\binom{m}{n}$ are the binomial coefficients defined by
$$\binom{m}{0} = 1, \qquad \binom{m}{n} = \frac{m(m-1) \ldots (m-n+1)}{n!} \quad \text{for } n = 1, 2, \ldots$$
Note that in this expression for the binomial coefficient, $m$ can be any real number. In the case that $m$ is a positive integer and $0 \le n \le m$, such as in the above expansion, it can be written in the more familiar form
$$\binom{m}{n} = \frac{m!}{(m-n)!\,n!}.$$
Also note that when both $m$ and $n$ are positive integers, with $n \ge m + 1$ the binomial coefficient is zero. The above expansion can then be written
$$(1 + x)^m = \sum_{n=0}^{\infty} \binom{m}{n} x^n.$$
Generalize the binomial theorem to arbitrary real powers by finding the Taylor series expansion of the function $f(x) = (1 + x)^{\alpha}$ about $x = 0$, where $\alpha \in \mathbb{R}$. Find also the radius of convergence of the series.
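$$f(x) = (1 + x)^{\alpha} \Rightarrow f'(x) = \alpha (1 + x)^{\alpha - 1}$$
$$f''(x) = \alpha (\alpha - 1)(1 + x)^{\alpha - 2}$$
and, in general,
$$f^{(n)}(x) = \alpha (\alpha - 1) \cdots (\alpha - n + 1)(1 + x)^{\alpha - n}$$
$$\Rightarrow f(0) = 1, \quad f^{(n)}(0) = \alpha (\alpha - 1) \cdots (\alpha - n + 1), \text{ for } n \ge 1.$$
In terms of the binomial coefficient, we have
$$f^{(n)}(0) = n! \binom{\alpha}{n}, \text{ for } n = 0, 1, 2, \ldots$$
Therefore the series expansion for $f(x)$ is
$$\sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} x^n = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n.$$
Radius of convergence:
If $\alpha \in \mathbb{N}$, the series terminates after $\alpha + 1$ terms. A series with a finite number of terms is convergent for all real $x$. Note that in this case $\frac{c_{n+1}}{c_n}$ is not defined (division by 0).
For $\alpha \notin \mathbb{N}$ the radius of convergence can be found by the ratio test
$$\lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \left|\frac{\alpha (\alpha - 1) \ldots (\alpha - n) x^{n+1}}{(n+1)!} \cdot \frac{n!}{\alpha (\alpha - 1) \ldots (\alpha - n + 1) x^n}\right| = \lim_{n\to\infty} \left|\frac{x (\alpha - n)}{n + 1}\right| = |x| \lim_{n\to\infty} \left|\frac{\frac{\alpha}{n} - 1}{1 + \frac{1}{n}}\right| = |x|.$$
Hence the series converges for $|x| < 1$.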
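The generalized binomial coefficient $\binom{\alpha}{n} = \frac{\alpha(\alpha-1)\cdots(\alpha-n+1)}{n!}$ and the resulting series are easy to try out numerically for $|x| < 1$. This is a sketch in Python; the function names, the cut-off of 200 terms and the sample values are our own choices.

```python
import math

def binom(alpha, n):
    """Generalized binomial coefficient alpha(alpha-1)...(alpha-n+1)/n!."""
    result = 1.0
    for k in range(n):
        result *= (alpha - k) / (k + 1)
    return result

def binomial_series(alpha, x, terms=200):
    """Partial sum of the binomial series for (1 + x)^alpha, |x| < 1."""
    return sum(binom(alpha, n) * x ** n for n in range(terms))

print(binomial_series(0.5, 0.3), math.sqrt(1.3))
print(binomial_series(-1, 0.5), 1 / 1.5)
print(binom(4, 2), binom(3, 5))
```

The last line illustrates the two integer cases from the text: the familiar value $\binom{4}{2} = 6$, and a zero coefficient when $n \ge m + 1$.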
8.6.1 Geometric Series from Taylor Series
A familiar series is the geometric series
$$\frac{1}{1 - x} = 1 + x + x^2 + \cdots, \quad \text{for } |x| < 1.$$
This result follows from the binomial theorem. Taking $\alpha = -1$ and replacing $x$ with $-x$ we have
$$f(x) = (1 - x)^{-1} = \frac{1}{1 - x}.$$
The binomial coefficients in this case are
$$\binom{-1}{n} = \frac{(-1)(-2) \ldots (-n)}{n!} = (-1)^n, \text{ for } n \ge 1.$$
The binomial theorem then gives, for $|x| < 1$,
$$\frac{1}{1 - x} = 1 + \sum_{n=1}^{\infty} (-1)^n (-x)^n = \sum_{n=0}^{\infty} x^n,$$
which is the geometric series.
9 Integration
9.1 Antiderivatives
(Stewart, p. 278) A function $F$ is called an antiderivative of $f$ on an interval $I$ if $F'(x) = f(x)$ for all $x$ in $I$.
See the table below for some examples of antiderivatives. We use the notation $F' = f$, $G' = g$ and $c$ is a constant.
Function                          Antiderivative
$c f(x)$                          $c F(x)$
$f(x) + g(x)$                     $F(x) + G(x)$
$x^{\alpha}$, ($\alpha \neq -1$)  $\frac{x^{\alpha+1}}{\alpha + 1}$
$\sin x$                          $-\cos x$
$\cos x$                          $\sin x$
$\sec^2 x$                        $\tan x$
$\frac{1}{x}$                     $\ln |x|$
$e^x$                             $e^x$
The most general antiderivative can be obtained from those given in the table by adding a constant. Notice that if $\frac{d}{dx} F(x) = f(x)$ then $\frac{d}{dx} (F(x) + C) = f(x)$ is also true where $C$ is any constant (independent of $x$).
9.1.1 Example
If $f''(x) = x - \sqrt{x}$, find $f(x)$.
9.2 Indefinite Integrals
The indefinite integral $\int f(x)\,dx$, of a suitable function $f$, is defined by
$$\int f(x)\,dx = F(x) + c,$$
where $F$ is any antiderivative of $f$ and $c$ is an arbitrary constant called the constant of integration. Thus $\int f(x)\,dx$ gives all the antiderivatives of $f$.
9.3 Area Under a Curve
(Stewart, pp. 294-301) Consider the problem of finding the area under a curve given by the graph of a continuous, non-negative function $y = f(x)$ between two points $x = a$ and $x = b$; see Figure 29.
Figure 29: We need to define the area of the region $S = \{(x, y) \mid a \le x \le b,\ 0 \le y \le f(x)\}$.
Although we might have an intuitive notion of what we mean by area, how do we define the area under such a curve?
We start by subdividing $S$ into $n$ strips $S_1, S_2, \ldots, S_n$ of equal width. The width of the interval $[a, b]$ is $b - a$ so the width of each strip is
$$\Delta x = \frac{b - a}{n}.$$
These strips divide the interval $[a, b]$ into $n$ subintervals
$$[x_0, x_1], [x_1, x_2], \ldots, [x_{n-1}, x_n]$$
where $x_0 = a$ and $x_n = b$. If $c_i$ is any point in the $i$th subinterval $[x_{i-1}, x_i]$, then we can approximate the $i$th strip $S_i$ by a rectangle of width $\Delta x$ and height $f(c_i)$, which is the value of $f$ at the point $c_i$ (see Figure 30).
Figure 30: A method of approximating the area of the region $S$.
The area of such a rectangle is $f(c_i) \Delta x$. Therefore we can approximate the area of $S$ by taking the sum of the area of these rectangles. We call this sum $R_n$:
$$R_n = f(c_1) \Delta x + f(c_2) \Delta x + \cdots + f(c_n) \Delta x.$$
Now what happens if we increase $n$? Consider the graph in Figure 31. It seems that as $\Delta x$ becomes smaller (when $n$ becomes larger), the approximation gets better.
Therefore we define the area $A$ of the region $S$ as follows.
Figure 31: The smaller the width of the rectangle, the better the approximation, in general. Notice that in this diagram we chose $c_i = a + i \Delta x$.
9.3.1 Area of a Region; Riemann Sums
The area $A$ of the region $S$ that lies under the graph of the continuous, non-negative function $f$ is
$$A = \lim_{n\to\infty} R_n = \lim_{n\to\infty} [f(c_1) \Delta x + \cdots + f(c_n) \Delta x].$$
It can be shown that for continuous $f$, this limit always exists and is independent of the choice of $c_i$. This sum $R_n$ is called a Riemann sum.
The value of $\lim_{n\to\infty} [f(c_1) \Delta x + \cdots + f(c_n) \Delta x]$ is called the Riemann integral of $f$ on $[a, b]$, denoted $\int_a^b f(x)\,dx$.
This definition can be carried over to functions which are not necessarily non-negative. The intuitive interpretation of $A$ as an area is then no longer valid. The quantity $A$ should then be viewed as a “signed area”, i.e.
$$A = (\text{area above } x\text{-axis, below graph}) - (\text{area below } x\text{-axis, above graph}).$$
See Stewart, p. 306.
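A sum of rectangle areas $f(c_i)\Delta x$ over $n$ equal strips, with the sample points $c_i = a + i\Delta x$ used in Figure 31, can be computed directly. This is a sketch in Python; the function name and the test function $f(x) = x^2$ on $[0, 1]$ are our own choices.

```python
def riemann_sum(f, a, b, n):
    """Sum of f(c_i) * dx over n equal strips, with c_i = a + i * dx."""
    dx = (b - a) / n
    return sum(f(a + i * dx) * dx for i in range(1, n + 1))

for n in (10, 100, 1000, 10000):
    print(n, riemann_sum(lambda x: x * x, 0.0, 1.0, n))
```

For this test function the printed values approach $1/3$ as $n$ increases.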
9.4 The Fundamental Theorem of Calculus
9.4.1 Theorem: Fundamental Theorem of Calculus
(a) If $f$ is continuous on $[a, b]$, $a \le x \le b$, then
$$A(x) = \int_a^x f(t)\,dt$$
gives a function $A$ on $[a, b]$ satisfying $A(a) = 0$ and $A'(x) = f(x)$ for all $x \in (a, b)$. Thus
$$\frac{d}{dx} \int_a^x f(t)\,dt = f(x).$$
(b) If $F(x)$ is any antiderivative of $f(x)$, then
$$F(b) - F(a) = \int_a^b f(t)\,dt.$$
Part (a) says there is some anti-derivative of $f$ (this part is of theoretical interest). Part (b) says once we have (somehow) found an anti-derivative of $f$, we can very easily work out the value of any integral $\int_a^b f(t)\,dt$.
9.4.2 Proof (outline)
9.4.3 Notation
We will write
$$F(x)\Big|_a^b = \Big[F(x)\Big]_a^b = F(b) - F(a).$$
9.4.4 Properties of the Definite Integral
For some $c \in \mathbb{R}$, $f$ continuous on $[a, b]$ there holds:
$$\int_b^a f(x)\,dx = -\int_a^b f(x)\,dx$$
$$\int_a^b c\,dx = c(b - a)$$
$$\int_a^b [f(x) \pm g(x)]\,dx = \int_a^b f(x)\,dx \pm \int_a^b g(x)\,dx$$
$$\int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx$$
$$a \le c \le b \Rightarrow \int_a^c f(x)\,dx + \int_c^b f(x)\,dx = \int_a^b f(x)\,dx$$
$$\int_a^a f(x)\,dx = 0.$$
9.4.5 Examples
(1) Evaluate $\int_1^2 \frac{1}{t}\,dt$.
(2) Evaluate $\int_0^{\pi/2} \sin x\,dx$.
9.4.6 Example
Find the area bounded by the curves $y = x^2$ and $y = x^3$ on the interval $[0, 1]$.
There holds $x^2 \ge x^3$ on $[0, 1]$,
$$\Rightarrow \text{Area} = \int_0^1 x^2\,dx - \int_0^1 x^3\,dx = \left[\frac{1}{3} x^3\right]_0^1 - \left[\frac{1}{4} x^4\right]_0^1 = \frac{1}{3} - \frac{1}{4} = \frac{1}{12}.$$
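Answers to definite integrals like those in 9.4.5 and 9.4.6 can be cross-checked numerically. The sketch below is in Python and uses the midpoint rule, which is our own choice of method (the workbook does not prescribe one); the function name and the number of strips are also assumptions.

```python
import math

def midpoint_rule(f, a, b, n=10000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

print(midpoint_rule(lambda t: 1 / t, 1, 2), math.log(2))
print(midpoint_rule(math.sin, 0, math.pi / 2), 1.0)
print(midpoint_rule(lambda x: x ** 2 - x ** 3, 0, 1), 1 / 12)
```

In each line the second number is the value given by part (b) of the Fundamental Theorem, using the antiderivatives $\ln |t|$, $-\cos x$ and $\frac{1}{3}x^3 - \frac{1}{4}x^4$.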
9.4.7 Integral of a Positive Function
If $f$ is non-negative and continuous on $[a, b]$ then $\int_a^b f(x)\,dx \ge 0$.
If $f$ is non-positive and continuous on $[a, b]$ then $\int_a^b f(x)\,dx = -(\text{area above graph below interval } [a, b]) \le 0$.
9.4.8 Example
Find the area enclosed by the curves $y = x$ and $y = x^3$.
9.5 Volume of Revolution
Suppose $f(x)$ is positive and continuous on $[a, b]$. By rotating the graph of $f(x)$ above $[a, b]$ about the $x$-axis, we obtain a cylindrical solid. What is the volume of such a solid? This is what we call a volume of revolution.
Figure 32: Rotating the curve in (a) about the $x$-axis gives the solid in (b).
9.5.1 Formula for the volume of revolution
Suppose $f(x) \ge 0$ and continuous on $[a, b]$. The volume of revolution of the solid obtained by rotating the graph $y = f(x)$ above $[a, b]$ about the $x$-axis is
$$V = \pi \int_a^b [f(x)]^2\,dx.$$
9.5.2 Outline of proof
For $x \in [a, b]$, let $V(x)$ be the volume obtained by rotating the graph of $f$ above the interval $[a, x]$ about the $x$-axis. So $V(a) = 0$, $V(b) = V$. Then for $h \ge 0$ small
$$\begin{aligned} V(x + h) - V(x) &= \text{volume obtained by rotating graph of } f \text{ above } [x, x + h] \text{ about } x\text{-axis} \\ &\simeq \text{volume of cylinder with radius } f(x) \text{ and height } h \\ &= \pi [f(x)]^2 h \quad (\text{becomes exact as } h \to 0) \end{aligned}$$
$$\Rightarrow \frac{V(x + h) - V(x)}{h} \simeq \pi [f(x)]^2$$
Taking the limit $h \to 0$ we obtain
$$V'(x) = \pi [f(x)]^2$$
so the Fundamental Theorem yields
$$\int_a^b \pi [f(x)]^2\,dx = [V(x)]_a^b = V(b) - V(a) = V(b) = V.$$
9.5.3 Example
Let $a > 0$. Find the volume of the solid obtained by rotating $y = f(x) = \sqrt{x}$ about the $x$-axis over $[0, a]$.
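An answer to this kind of problem can be checked numerically from $V = \pi \int_a^b [f(x)]^2\,dx$. The sketch below is in Python, uses the midpoint rule (our own choice of method), and picks the upper limit $2$ purely for illustration. The value it is compared with, $\pi a^2 / 2$, comes from integrating $\pi x$ over $[0, a]$.

```python
import math

def volume_of_revolution(f, a, b, n=10000):
    """Approximate pi * integral of f(x)^2 over [a, b] by the midpoint rule."""
    dx = (b - a) / n
    return math.pi * sum(f(a + (i + 0.5) * dx) ** 2 for i in range(n)) * dx

upper = 2.0  # illustrative value of the upper limit a > 0
print(volume_of_revolution(math.sqrt, 0.0, upper), math.pi * upper ** 2 / 2)
```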
9.6 Improper Integrals
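This material is covered in Stewart, pp. 567-574.
Definition
$$\int_a^{\infty} f(x)\,dx = \lim_{t\to\infty} \int_a^t f(x)\,dx, \qquad \int_{-\infty}^b f(x)\,dx = \lim_{t\to-\infty} \int_t^b f(x)\,dx.$$
The integrals $\int_a^{\infty} f(x)\,dx$ and $\int_{-\infty}^b f(x)\,dx$ are called convergent if the limit exists and divergent otherwise.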
9.6.1 Example
Find $\int_0^{\infty} e^{-x}\,dx$.
9.6.2 Note
It is still possible to define the integral for some classes of functions which are not necessarily continuous (see Stewart, p. 571).
9.6.3 Part (a) of the Fundamental Theorem
If $F(u) = \int_2^u \frac{dt}{\sqrt{1 + t^2}}$, find $F'(u)$.
If $G(x) = \int_2^{\sin x} \frac{dt}{\sqrt{1 + t^2}}$, find $G'(x)$.
9.6.4 Example
The number of prime numbers less than $N$ is approximated by $P(N)$, where
$$P(t) = \int_2^t \frac{1}{\ln x}\,dx.$$
Show that $P$ is an increasing function of $t$, for $t \ge 2$.
It is not possible to find a simple anti-derivative for $\frac{1}{\ln x}$. This means we cannot find a simple formula for $P(t)$. However we can work out $P'(t)$ by the Fundamental Theorem of Calculus (part (a)):
$$P'(t) = \frac{d}{dt} \int_2^t \frac{1}{\ln x}\,dx = \frac{1}{\ln t}.$$
Since $\ln t > 0$ for $t \ge 2$, $P'(t) > 0$, so $P$ is an increasing function.
Note that as $t \to \infty$, $P'(t) \to 0$, so the function $P$ is increasing more and more slowly.
Numerical example: the number of prime numbers below one million: $2, 3, 5, 7, 11, \ldots, 999983$ is 78498.
Using a computer one can work out that $\int_2^{10^6} \frac{1}{\ln x}\,dx \approx 78626.5$, which is pretty close to the true result.
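The phrase "using a computer" can be made concrete with any numerical integration rule. The sketch below is in Python and uses composite Simpson's rule with one million subintervals; both the method and the step count are our own assumptions, since the workbook does not say how its value was obtained.

```python
import math

def simpson(f, a, b, n=1_000_000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

P = simpson(lambda x: 1 / math.log(x), 2, 10 ** 6)
print(P)
```

The result should land close to the value of about 78626.5 quoted in the example.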
9.7 Techniques of Integration
This material is covered in Stewart, Section 4.5 and Chapter 7.
9.7.1 Substitution
If $u = g(x)$ is a differentiable function whose range is an interval $I$ and $f$ is continuous on $I$, then
$$\int f(g(x))\,g'(x)\,dx = \int f(u)\,du.$$
If $g'$ is continuous on $[a, b]$ and $f$ is continuous on the range of $u = g(x)$, then
$$\int_a^b f(g(x))\,g'(x)\,dx = \int_{g(a)}^{g(b)} f(u)\,du.$$
9.7.2 Example
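(Stewart, p. 341 ex. 1) Find $\int x^3 \cos(x^4 + 2)\,dx$.
Substitute $u = x^4 + 2$ so $\frac{du}{dx} = 4x^3$ and $x^3 = \frac{1}{4} \frac{du}{dx}$.
$$\begin{aligned} \Rightarrow \int x^3 \cos(x^4 + 2)\,dx &= \int \frac{1}{4} \frac{du}{dx} \cos u\,dx \\ &= \int \frac{\cos u}{4}\,du \\ &= \frac{1}{4} \int \cos u\,du \\ &= \frac{1}{4} \sin u + c \\ &= \frac{1}{4} \sin(x^4 + 2) + c. \end{aligned}$$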
9.7.3 Trigonometric Substitutions
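Evaluate $\int \frac{dx}{\sqrt{a^2 - x^2}}$.
Substitute $x = a \sin \theta$ where $a > 0$ and $|\theta| < \pi/2$. Then $dx = a \cos \theta\,d\theta$. Hence:
$$\begin{aligned} \int \frac{dx}{\sqrt{a^2 - x^2}} &= \int \frac{a \cos \theta\,d\theta}{\sqrt{a^2 - a^2 \sin^2 \theta}} \\ &= \int \frac{a \cos \theta\,d\theta}{a \sqrt{1 - \sin^2 \theta}} \\ &= \int \frac{\cos \theta}{\cos \theta}\,d\theta \\ &= \int d\theta \\ &= \theta + c \\ &= \arcsin\left(\frac{x}{a}\right) + c. \end{aligned}$$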
9.7.4 Integration by Parts
Given two differentiable functions $u(x)$ and $v(x)$ we have a product rule for differentiation:
$$(uv)' = u'v + uv'.$$
Therefore $uv$ is an antiderivative of $uv' + u'v$. Hence
$$\begin{aligned} uv &= \int (u'v + uv')\,dx \\ &= \int u'v\,dx + \int uv'\,dx \end{aligned}$$
$$\Rightarrow \int uv'\,dx = uv - \int u'v\,dx.$$
The aim is to simplify the function that is integrated, i.e. $\int u'v\,dx$ should be easier to find than $\int uv'\,dx$.
Also note that the antiderivative of $v'$ needs to be found!
9.7.5 Example
Evaluate $\int x^3 \ln x\,dx$.
9.7.6 Example
Evaluate $\int x e^x\,dx$.
Set $u = x$ and $v' = e^x$. Then $u' = 1$ and $v = e^x$.
$$\Rightarrow \int x e^x\,dx = x e^x - \int e^x\,dx = (x - 1) e^x + c.$$
Note that for the choice $u = e^x$ and $v' = x$, the integral becomes more complicated than before.
9.7.7 Rules of Thumb
• For $\int (\text{polynomial}) \cdot e^{ax+b}\,dx$, choose $u$ to be the polynomial and $v' = e^{ax+b}$.
• For $\int (\text{polynomial}) \cdot \ln(ax + b)\,dx$, choose $u = \ln(ax + b)$ and $v'$ to be the polynomial.
• Note that sometimes integration by parts needs to be applied more than once.
9.7.8 Example
Evaluate $\int x \sin x\,dx$.
Set $u = x$ and $v' = \sin x$. Then $u' = 1$ and $v = -\cos x$.
$$\Rightarrow \int x \sin x\,dx = -x \cos x - \int (-\cos x)\,dx = -x \cos x + \sin x + c.$$
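Antiderivatives found by integration by parts, such as $(x - 1)e^x$ for $x e^x$ and $-x \cos x + \sin x$ for $x \sin x$, can be checked by comparing $F(b) - F(a)$ with a numerical integral. This is a sketch in Python; the midpoint rule, the function names and the intervals $[0, 1]$ and $[0, \pi]$ are our own choices.

```python
import math

def midpoint_rule(f, a, b, n=20000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    dx = (b - a) / n
    return sum(f(a + (i + 0.5) * dx) for i in range(n)) * dx

def F1(x):
    """Antiderivative of x e^x obtained by parts."""
    return (x - 1) * math.exp(x)

def F2(x):
    """Antiderivative of x sin x obtained by parts."""
    return -x * math.cos(x) + math.sin(x)

print(midpoint_rule(lambda x: x * math.exp(x), 0, 1), F1(1) - F1(0))
print(midpoint_rule(lambda x: x * math.sin(x), 0, math.pi), F2(math.pi) - F2(0))
```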
9.8 Integrals Involving the ln Function
9.9 Partial Fractions
A ratio of two polynomials,
$$f(x) = \frac{P(x)}{Q(x)}$$
is called a rational function. Integration of a rational function is made possible by expressing it as the sum of simpler fractions, called partial fractions.
(In the three examples which follow, we will only consider rational functions for which the degree of $P(x)$ is less than the degree of $Q(x)$.)
The general process for obtaining the sum of partial fractions is as follows:
• Factor the denominator $Q(x)$ as much as possible;
• Express $f(x)$ as a sum of fractions each of which takes either the form $\frac{A}{(ax + b)^i}$ or $\frac{Ax + B}{(ax^2 + bx + c)^j}$, where the denominators of these fractions come from the factorisation of $Q(x)$.
Consider the following examples.
9.9.1 Example
Evaluate $\int \frac{x + 2}{x^2 + x}\,dx$.
9.9.2 Example
Evaluate $\int \frac{dx}{x^2 - a^2}$, $a \neq 0$.
9.9.3 Example
Evaluate $\int \frac{2x^2 - x + 4}{x^3 + 4x}\,dx$.
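Given that $x^3 + 4x = x(x^2 + 4)$, then
$$\frac{2x^2 - x + 4}{x(x^2 + 4)} = \frac{A}{x} + \frac{Bx + C}{x^2 + 4}.$$
Multiplying each side by $x(x^2 + 4)$ we get:
$$2x^2 - x + 4 = (A + B)x^2 + Cx + 4A,$$
the solution to which is $A = 1$, $B = 1$ and $C = -1$. Hence:
$$\begin{aligned} \int \frac{2x^2 - x + 4}{x^3 + 4x}\,dx &= \int \left(\frac{1}{x} + \frac{x - 1}{x^2 + 4}\right) dx \\ &= \int \frac{1}{x}\,dx + \int \frac{x}{x^2 + 4}\,dx - \int \frac{1}{x^2 + 4}\,dx \end{aligned}$$
1. The first integral is easy:
$$\int \frac{1}{x}\,dx = \ln |x| + c_1$$
2. For the second integral, set $u = x^2 + 4$. Then $\frac{du}{dx} = 2x$ and $dx = \frac{du}{2x}$. Hence:
$$\begin{aligned} \int \frac{x}{x^2 + 4}\,dx &= \int \frac{x}{u} \cdot \frac{du}{2x} \\ &= \frac{1}{2} \int \frac{1}{u}\,du \\ &= \frac{1}{2} \ln |u| + c_2 = \frac{1}{2} \ln \left|x^2 + 4\right| + c_2 \end{aligned}$$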
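3. For the third integral, set $x = 2 \tan \theta$. Then $dx = 2 \sec^2 \theta\,d\theta$. Hence
$$\begin{aligned} \int \frac{1}{x^2 + 4}\,dx &= \int \frac{2 \sec^2 \theta}{4 \tan^2 \theta + 4}\,d\theta \\ &= \int \frac{2 \sec^2 \theta}{4 \sec^2 \theta}\,d\theta \\ &= \frac{1}{2} \int d\theta \\ &= \frac{1}{2} \theta + c_3 \\ &= \frac{1}{2} \arctan\left(\frac{x}{2}\right) + c_3 \end{aligned}$$
Thus
$$\int \frac{2x^2 - x + 4}{x^3 + 4x}\,dx = \ln |x| + \frac{1}{2} \ln \left|x^2 + 4\right| - \frac{1}{2} \arctan\left(\frac{x}{2}\right) + c.$$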
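A result such as $\ln |x| + \frac{1}{2} \ln |x^2 + 4| - \frac{1}{2} \arctan(x/2) + c$ can be checked by differentiating it numerically and comparing with the integrand $\frac{2x^2 - x + 4}{x^3 + 4x}$. This is a sketch in Python; the central-difference step and the sample points are our own choices.

```python
import math

def integrand(x):
    return (2 * x ** 2 - x + 4) / (x ** 3 + 4 * x)

def antiderivative(x):
    """The result of the partial-fraction calculation, with c = 0."""
    return (math.log(abs(x)) + 0.5 * math.log(x ** 2 + 4)
            - 0.5 * math.atan(x / 2))

def central_difference(F, x, h=1e-5):
    """Numerical derivative (F(x + h) - F(x - h)) / (2h)."""
    return (F(x + h) - F(x - h)) / (2 * h)

for x in (0.5, 1.0, 3.0, -2.0):
    print(x, central_difference(antiderivative, x), integrand(x))
```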
10 Vectors
It is assumed that you are familiar with vectors, vector addition, scalar multiplication and calculating the length of vectors. If not, please consult the review material at the end of the workbook.
10.1 Row and column vectors
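We may write vectors using columns, e.g. $\begin{pmatrix} a \\ b \end{pmatrix}$, or as row vectors $\begin{pmatrix} a & b \end{pmatrix}$. We usually use column vectors in this course.
If $P$, $Q$ are points, $\overrightarrow{PQ}$ denotes the vector from $P$ to $Q$. Vectors will be indicated by bold lowercase letters $\mathbf{v}$, $\mathbf{w}$ etc. Writing by hand you may use $\underline{v}$ or $\vec{v}$ or $\underset{\sim}{v}$.
The norm or length of $\mathbf{v}$ is written $\|\mathbf{v}\|$, so
$$\left\| \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \right\| = \sqrt{v_1^2 + v_2^2}, \qquad \left\| \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \right\| = \sqrt{v_1^2 + v_2^2 + v_3^2}.$$
Denote 2 and 3 dimensional space by $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively. Thus
$$\mathbb{R}^2 = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \,\middle|\, x, y \in \mathbb{R} \right\} \quad \text{and} \quad \mathbb{R}^3 = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \,\middle|\, x, y, z \in \mathbb{R} \right\}.$$
Let
$$\mathbf{i} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \ \mathbf{j} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \text{ in } \mathbb{R}^2, \qquad \mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \ \mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \ \mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \text{ in } \mathbb{R}^3.$$
For any vector $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \in \mathbb{R}^2$ we have
$$\mathbf{v} = \begin{pmatrix} v_1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ v_2 \end{pmatrix} = v_1 \mathbf{i} + v_2 \mathbf{j},$$
and for any vector $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} \in \mathbb{R}^3$
$$\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}.$$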
10.1.1 Review Example
An albatross is flying NE at 20 km/h into a 10 km/h wind in direction E60°S. Find the speed and direction of the bird relative to the ground.
Figure 33: $\mathbf{v}$ gives the velocity vector of the albatross.
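We have:
$$\mathbf{z} = 20 \cos 45^\circ\, \mathbf{i} + 20 \sin 45^\circ\, \mathbf{j} = 10\sqrt{2}\, \mathbf{i} + 10\sqrt{2}\, \mathbf{j};$$
$$\mathbf{w} = 10 \cos 60^\circ\, \mathbf{i} - 10 \sin 60^\circ\, \mathbf{j} = 5\, \mathbf{i} - 5\sqrt{3}\, \mathbf{j}.$$
So $\mathbf{v} = \mathbf{z} + \mathbf{w} = (10\sqrt{2} + 5)\, \mathbf{i} + (10\sqrt{2} - 5\sqrt{3})\, \mathbf{j}$.
Therefore the magnitude of $\mathbf{v}$ is
$$\|\mathbf{v}\| = \sqrt{(10\sqrt{2} + 5)^2 + (10\sqrt{2} - 5\sqrt{3})^2} = \sqrt{500 + 100\sqrt{2}(1 - \sqrt{3})} \approx 19.91.$$
To calculate the angle $\theta$ we use
$$\tan \theta = \frac{10\sqrt{2} - 5\sqrt{3}}{10\sqrt{2} + 5} \Rightarrow \theta \approx 16^\circ$$
So $\mathbf{v} = 19.9\ \mathrm{km\,h^{-1}}$ at E$16^\circ$N.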
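The arithmetic in this example is easy to reproduce. The sketch below is in Python; the helper name `polar_to_vector` and the convention of measuring angles anticlockwise from due east are our own choices.

```python
import math

def polar_to_vector(speed, degrees_from_east):
    """(east, north) components of a vector at the given angle from east."""
    angle = math.radians(degrees_from_east)
    return (speed * math.cos(angle), speed * math.sin(angle))

z = polar_to_vector(20, 45)    # bird through the air: NE
w = polar_to_vector(10, -60)   # wind: 60 degrees south of east
v = (z[0] + w[0], z[1] + w[1])

speed = math.hypot(v[0], v[1])
theta = math.degrees(math.atan2(v[1], v[0]))
print(round(speed, 2), round(theta, 1))
```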
10.2 Dot Product
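For non zero vectors $\mathbf{v} = \overrightarrow{OP}$, $\mathbf{w} = \overrightarrow{OQ}$ the angle between $\mathbf{v}$ and $\mathbf{w}$ is the angle $\theta$ with $0 \le \theta \le \pi$ radians between $\overrightarrow{OP}$ and $\overrightarrow{OQ}$ at the origin $O$; see Figure 34.
Figure 34: $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{w}$.
The dot (or scalar or inner) product of vectors $\mathbf{v}$ and $\mathbf{w}$, denoted by $\mathbf{v} \cdot \mathbf{w}$, is the number given by
$$\mathbf{v} \cdot \mathbf{w} = \begin{cases} 0, & \text{if } \mathbf{v} \text{ or } \mathbf{w} = \mathbf{0} \\ \|\mathbf{v}\| \cdot \|\mathbf{w}\| \cos \theta, & \text{otherwise} \end{cases}$$
where $\theta$ is the angle between $\mathbf{v}$ and $\mathbf{w}$.
If $\mathbf{v}, \mathbf{w} \neq \mathbf{0}$ and $\mathbf{v} \cdot \mathbf{w} = 0$ then $\mathbf{v}$ and $\mathbf{w}$ are said to be orthogonal or perpendicular.
If $\mathbf{v} = v_1 \mathbf{i} + v_2 \mathbf{j} + v_3 \mathbf{k}$ and $\mathbf{w} = w_1 \mathbf{i} + w_2 \mathbf{j} + w_3 \mathbf{k}$ are two vectors, then $\mathbf{v} \cdot \mathbf{w}$ is given by:
$$\mathbf{v} \cdot \mathbf{w} = v_1 w_1 + v_2 w_2 + v_3 w_3.$$
In particular, for $\mathbf{v} \in \mathbb{R}^3$,
$$\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + v_3^2.$$
It is assumed that you are familiar with properties of dot product. These properties are listed (in a more general setting) at the end of this chapter.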
10.2.1 Example
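If $P = (2, 4, -1)$, $Q = (1, 1, 1)$, $R = (-2, 2, 3)$, find the angle $\theta = \angle PQR$.
Find vectors joining $Q$ to $P$, and $Q$ to $R$:
$$\overrightarrow{QP} = \begin{pmatrix} 2 \\ 4 \\ -1 \end{pmatrix} - \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix}, \qquad \overrightarrow{QR} = \begin{pmatrix} -2 \\ 2 \\ 3 \end{pmatrix} - \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ 1 \\ 2 \end{pmatrix}.$$
$$\therefore\ \left\|\overrightarrow{QP}\right\| = \left\|\overrightarrow{QR}\right\| = \sqrt{1 + 9 + 4} = \sqrt{14}.$$
$$\overrightarrow{QP} \cdot \overrightarrow{QR} = \begin{pmatrix} 1 \\ 3 \\ -2 \end{pmatrix} \cdot \begin{pmatrix} -3 \\ 1 \\ 2 \end{pmatrix} = -3 + 3 - 4 = -4$$
$$\therefore\ \cos \theta = \frac{\overrightarrow{QP} \cdot \overrightarrow{QR}}{\left\|\overrightarrow{QP}\right\| \cdot \left\|\overrightarrow{QR}\right\|} = \frac{-4}{14} \Rightarrow \theta \simeq 107^\circ.$$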
Find vectors joining Q to P , and Q to R:
−→QP =
24−1
− 1
11
=
13−2
,
−→QR =
−223
− 1
11
=
−312
.
∴
∥∥∥∥−→QP∥∥∥∥ =
∥∥∥∥−→QR∥∥∥∥ =√
1 + 9 + 4 =√
14.
−→QP ·
−→QR =
13−2
· −3
12
= −3 + 3− 4 = −4
∴ cos θ =
−→QP ·
−→QR∥∥∥∥−→QP∥∥∥∥ · ∥∥∥∥−→QR∥∥∥∥ =
−4
14⇒ θ ' 107
◦.
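The same calculation can be sketched in Python as a check. The helper names `dot` and `angle_deg` are introduced here for illustration only; they are not part of the course material.

```python
import math

def dot(v, w):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(v, w))

def angle_deg(v, w):
    """Angle between non-zero vectors v and w, in degrees (0 to 180)."""
    cos_theta = dot(v, w) / (math.sqrt(dot(v, v)) * math.sqrt(dot(w, w)))
    return math.degrees(math.acos(cos_theta))

P, Q, R = (2, 4, -1), (1, 1, 1), (-2, 2, 3)
QP = tuple(p - q for p, q in zip(P, Q))   # vector from Q to P
QR = tuple(r - q for r, q in zip(R, Q))   # vector from Q to R
theta = angle_deg(QP, QR)
print(QP, QR, dot(QP, QR), round(theta))
```

The dot product is negative here, which is why the angle comes out larger than 90°.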
10.3 The Projection Formula
Fix a vector v. Given another vector w we can write w as
w = w1 + w2
where:
• w1 is in the direction of v
• w2 is perpendicular to v.
See Figure 35. We want to find w1 and w2 in terms of v and w.
Since w1 is in the direction of v, let
w1 = αv for some α ∈ R.
Then w2 = w − αv.
Figure 35: w can be decomposed into a component w1 in the direction of v and a component w2 perpendicular to v.
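We need to choose α to make v and w2 orthogonal:
\[ 0 = w_2 \cdot v = (w - \alpha v) \cdot v = w \cdot v - \alpha\, v \cdot v = w \cdot v - \alpha \|v\|^2. \]
So we need
\[ \alpha = \frac{w \cdot v}{\|v\|^2}. \]
We have derived the projection formula:
\[ w_1 = \frac{w \cdot v}{\|v\|^2}\, v, \qquad w_2 = w - \frac{w \cdot v}{\|v\|^2}\, v. \tag{10.1} \]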
10.3.1 Example
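Find the projection of $w = \begin{pmatrix} 6 \\ 2 \end{pmatrix}$ onto $v = \begin{pmatrix} 6 \\ 6 \end{pmatrix}$.
\[ w_1 = \frac{w \cdot v}{\|v\|^2}\, v = \frac{36 + 12}{6^2 + 6^2} \begin{pmatrix} 6 \\ 6 \end{pmatrix} = \frac{2}{3} \begin{pmatrix} 6 \\ 6 \end{pmatrix} = \begin{pmatrix} 4 \\ 4 \end{pmatrix}. \]
\[ w_2 = w - w_1 = \begin{pmatrix} 6 \\ 2 \end{pmatrix} - \begin{pmatrix} 4 \\ 4 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix}. \]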
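As an illustration, the projection formula (10.1) can be sketched in a few lines of Python. The function name `project` is a choice made here, and exact fractions are used so that the result matches the hand calculation; the sketch assumes integer components and v ≠ 0.

```python
from fractions import Fraction

def dot(v, w):
    """Dot product of two vectors given as equal-length sequences."""
    return sum(a * b for a, b in zip(v, w))

def project(w, v):
    """Split w into w1 (along v) and w2 (perpendicular to v), as in (10.1)."""
    alpha = Fraction(dot(w, v), dot(v, v))   # alpha = (w . v) / ||v||^2
    w1 = [alpha * c for c in v]
    w2 = [a - b for a, b in zip(w, w1)]
    return w1, w2

w1, w2 = project([6, 2], [6, 6])
print(w1, w2, dot(w2, [6, 6]))
```

The last printed value is w2 · v, which should be 0 because w2 is perpendicular to v.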
10.4 Vectors in Rn
We are familiar with vectors in two and three dimensional space, $\mathbb{R}^2$ and $\mathbb{R}^3$. These generalize to n dimensional space, denoted $\mathbb{R}^n$. A vector v in $\mathbb{R}^n$ is specified by n components:
\[ v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}, \qquad v_i \in \mathbb{R}. \]
We define addition of vectors and multiplication by a scalar component-wise. Thus if α ∈ R is a scalar and
\[ w = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} \]
is another vector then
\[ v + w = \begin{pmatrix} v_1 + w_1 \\ v_2 + w_2 \\ \vdots \\ v_n + w_n \end{pmatrix} = w + v, \qquad \alpha v = \begin{pmatrix} \alpha v_1 \\ \alpha v_2 \\ \vdots \\ \alpha v_n \end{pmatrix}. \]
We define the dot (or scalar) product between v and w as
\[ v \cdot w = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n = \sum_{k=1}^{n} v_k w_k. \]
The length or norm of the vector v is
\[ \|v\| = \sqrt{v \cdot v} = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}. \]
Every vector v in $\mathbb{R}^n$ is expressible in terms of n special vectors $e_1, \ldots, e_n$ called coordinate vectors, where
\[ e_i = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \leftarrow i\text{th position}. \]
Proof:
An arbitrary vector $v \in \mathbb{R}^n$ can be written as
\[ v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} = \begin{pmatrix} v_1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ v_2 \\ \vdots \\ 0 \end{pmatrix} + \cdots + \begin{pmatrix} 0 \\ 0 \\ \vdots \\ v_n \end{pmatrix} = v_1 e_1 + v_2 e_2 + \cdots + v_n e_n. \]
10.4.1 Notation
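(1) E.g. in $\mathbb{R}^3$ we have $i = e_1$, $j = e_2$, $k = e_3$.
(2) We have also the zero vector
\[ \mathbf{0} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \]
satisfying
\[ \mathbf{0} + v = v + \mathbf{0} = v, \qquad \mathbf{0} \cdot v = 0, \qquad \text{for every vector } v. \]
(3) If v ≠ 0, w ≠ 0, we say that vectors v, w are perpendicular (or orthogonal) if
\[ v \cdot w = 0. \]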
10.5 Properties of Dot Product
If u, v and w ∈ Rn and α ∈ R then
(1) u · v = v · u.
(2) v · (w + u) = v ·w + v · u.
(3) (αv) ·w = α(v ·w) = v · (αw).
(4) v · v ≥ 0 and v · v = 0 if and only if v = 0.
11 Matrices and Linear Transformations
11.1 2× 2 matrices
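Recall that a 2 × 2 array of numbers
\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]
is called a 2 × 2 matrix (plural matrices). Suppose $v = \begin{pmatrix} x \\ y \end{pmatrix}$ is a 2 element column vector. Then the product Av is defined to be the 2 element column vector whose entries are given by taking the dot product of each row of A with the vector v:
\[ Av = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} ax + by \\ cx + dy \end{pmatrix}. \]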
11.1.1 Example
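Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$, and let $v = \begin{pmatrix} 5 \\ 6 \end{pmatrix}$. Calculate Av.
\[ Av = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} \begin{pmatrix} 5 \\ 6 \end{pmatrix} = \begin{pmatrix} 1 \cdot 5 + 2 \cdot 6 \\ 3 \cdot 5 + 4 \cdot 6 \end{pmatrix} = \begin{pmatrix} 17 \\ 39 \end{pmatrix}. \]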
11.2 The effect of matrix multiplication on vectors
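We can visualize geometrically what happens when we multiply a vector by a 2 × 2 matrix.
• Multiplying a 2 × 2 matrix A by i and j gives
\[ Ai = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} a \\ c \end{pmatrix} \quad\text{and}\quad Aj = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \begin{pmatrix} b \\ d \end{pmatrix}. \]
Ai is the 1st column of A, and Aj is the 2nd column of A.
This will be used again many times in the course.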
11.2.1 Example
Consider the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. Thus $Ai = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and $Aj = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$.
A maps $i \mapsto \begin{pmatrix} 2 \\ 1 \end{pmatrix}$ and $j \mapsto \begin{pmatrix} 1 \\ 2 \end{pmatrix}$.
In this way 2 × 2 matrices can be used to manipulate position vectors in the plane. This has applications in computer graphics. We often want to transform a set of points forming a figure on the screen, e.g. rescaling the figure, or rotating it about a central point. Such transformations can be accomplished by multiplying by the appropriate matrix.
It can be shown that multiplying by a 2 × 2 matrix in this way always maps collinear points to collinear points. For this reason a matrix is said to represent a linear transformation.
11.3 Composing Transformations & Matrix Multiplication
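Suppose we perform two linear transformations, one after the other. Suppose the first transformation we perform is given by the matrix $B = \begin{pmatrix} s & t \\ u & v \end{pmatrix}$, and the second is given by the matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. A vector $\mathbf{v} = \begin{pmatrix} x \\ y \end{pmatrix}$ is first sent to
\[ B\mathbf{v} = \begin{pmatrix} s & t \\ u & v \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} sx + ty \\ ux + vy \end{pmatrix}. \]
The resulting vector $B\mathbf{v}$ is then acted on by A, giving
\[ A \begin{pmatrix} sx + ty \\ ux + vy \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} sx + ty \\ ux + vy \end{pmatrix} = \begin{pmatrix} a(sx + ty) + b(ux + vy) \\ c(sx + ty) + d(ux + vy) \end{pmatrix} = \begin{pmatrix} (as + bu)x + (at + bv)y \\ (cs + du)x + (ct + dv)y \end{pmatrix} = \begin{pmatrix} as + bu & at + bv \\ cs + du & ct + dv \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]
We define the product AB to be the 2 × 2 matrix
\[ AB = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} s & t \\ u & v \end{pmatrix} = \begin{pmatrix} as + bu & at + bv \\ cs + du & ct + dv \end{pmatrix}. \]
The effect of performing the transformation represented by B and then the transformation represented by A is to perform the transformation represented by AB. In symbols
\[ A(B\mathbf{v}) = (AB)\mathbf{v}. \]
Note that the matrix AB represents the transformation of first applying B, and then applying A. Compare composition of functions F ∘ G.
Linear transformations in three dimensions can be accomplished using larger matrices.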
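The identity A(Bv) = (AB)v can be checked on a concrete case. Below is a minimal Python sketch; the helper names `mat_vec` and `mat_mul` and the particular matrices are chosen here for illustration (A is the matrix of Example 11.2.1, B and v are those of Example 11.1.1).

```python
def mat_vec(A, v):
    """Product of matrix A (list of rows) with column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(A, B):
    """Product AB: entry (i, k) is the dot product of row i of A and column k of B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A = [[2, 1], [1, 2]]   # applied second
B = [[1, 2], [3, 4]]   # applied first
v = [5, 6]

step_by_step = mat_vec(A, mat_vec(B, v))   # A(Bv)
combined = mat_vec(mat_mul(A, B), v)       # (AB)v
print(step_by_step, combined)              # both are [73, 95]
```

Swapping the roles of A and B in `mat_mul` generally gives a different matrix, which reflects the fact that the order of the transformations matters.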
11.4 Definition: Matrix
A rectangular array of numbers
\[ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix} \]
is called an m × n matrix. It is made up of m rows and n columns. Entries of the jth column may be assembled into a vector
\[ \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{mj} \end{pmatrix} \]
called the jth column vector of A. Similarly the ith row vector of A is
\[ (a_{i1} \;\; a_{i2} \;\; \cdots \;\; a_{in}). \]
Note that an m × 1 matrix is a column vector and a 1 × n matrix is a row vector. The number $a_{ij}$ in the ith row and jth column of A is called the (i, j)th entry of A. For brevity we write $A = (a_{ij})$.
11.4.1 Example
The 2 × 3 matrix A with entries $a_{ij} = i - j$ is
\[ A = \begin{pmatrix} 0 & -1 & -2 \\ 1 & 0 & -1 \end{pmatrix}. \]
11.4.2 Example
The m × n matrix A whose entries are all zero is called the zero matrix, denoted 0; e.g. the zero 3 × 2 matrix is
\[ \mathbf{0} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{pmatrix}. \]
11.5 Equality
Matrices A, B are said to be equal, written A = B, if they are the same size and aij = bij , for all i, j.
11.6 Addition
The sum of two m× n matrices A and B is defined to be the m× n matrix A+B with entries
(A+B)ij = aij + bij
We add matrices element wise, as for vectors.
Addition is only defined between matrices of the same size.
11.6.1 Properties
For matrices A, B, C of the same size we have
A+B = B +A
(A+B) + C = A+ (B + C).
11.7 Scalar Multiplication
Let A be an m× n matrix and α ∈ R. We define αA to be the m× n matrix with entries
(αA)ij = α · aij , for all i, j.
Write −A for (−1) ·A. Define subtraction between matrices of the same size by
A−B = A+ (−B).
11.7.1 Example
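Find A + B and A − B if
\[ A = \begin{pmatrix} -4 & 6 & 3 \\ 0 & 1 & 2 \end{pmatrix}, \qquad B = \begin{pmatrix} 5 & -1 & 0 \\ 3 & 1 & 0 \end{pmatrix}. \]
\[ A + B = \begin{pmatrix} 1 & 5 & 3 \\ 3 & 2 & 2 \end{pmatrix}, \qquad A - B = \begin{pmatrix} -9 & 7 & 3 \\ -3 & 0 & 2 \end{pmatrix}. \]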
11.7.2 Example
If $A = \begin{pmatrix} 6 & 0 \\ 2 & -1 \end{pmatrix}$ then $A + A = \begin{pmatrix} 12 & 0 \\ 4 & -2 \end{pmatrix} = 2A$.
11.8 Matrix Multiplication
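We generalize the multiplication of 2 × 2 matrices. Let $A = (a_{ij})$ be an m × n matrix and $B = (b_{jk})$ an n × r matrix. Then AB is the m × r matrix with (i, k) entry
\[ (ab)_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{in} b_{nk}: \]
\[ \underbrace{\begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ \vdots & \vdots & & \vdots \\ a_{i1} & a_{i2} & \cdots & a_{in} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}}_{A\ (m \times n)} \; \underbrace{\begin{pmatrix} b_{11} & \cdots & b_{1k} & \cdots & b_{1r} \\ b_{21} & \cdots & b_{2k} & \cdots & b_{2r} \\ \vdots & & \vdots & & \vdots \\ b_{n1} & \cdots & b_{nk} & \cdots & b_{nr} \end{pmatrix}}_{B\ (n \times r)} = AB\ (m \times r) \]
where $(ab)_{ik}$ = dot product between ith row of A and kth column of B.
AB is only defined if the number of columns of A = the number of rows of B.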
11.8.1 Example
• Let $A = (-1 \;\; 0 \;\; 1)$, which is 1 × 3, and $B = \begin{pmatrix} 0 & -1 \\ 1 & 2 \\ -2 & 5 \end{pmatrix}$, which is 3 × 2.
• Then AB is defined, but BA (a 3 × 2 matrix times a 1 × 3 matrix) is not defined.
Calculate AB.
\[ AB = (-1 \;\; 0 \;\; 1) \begin{pmatrix} 0 & -1 \\ 1 & 2 \\ -2 & 5 \end{pmatrix} = \big( (-1) \cdot 0 + 0 \cdot 1 + 1 \cdot (-2) \quad (-1) \cdot (-1) + 0 \cdot 2 + 1 \cdot 5 \big) = (-2 \;\; 6). \]
11.8.2 Example
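\[ A = (1 \;\; 3 \;\; 9), \qquad B = \begin{pmatrix} 2 \\ 1 \\ 7 \end{pmatrix}. \]
Calculate AB and BA.
\[ AB = \underbrace{(1 \;\; 3 \;\; 9)}_{1 \times 3} \underbrace{\begin{pmatrix} 2 \\ 1 \\ 7 \end{pmatrix}}_{3 \times 1} = 2 + 3 + 63 = 68, \]
\[ BA = \underbrace{\begin{pmatrix} 2 \\ 1 \\ 7 \end{pmatrix}}_{3 \times 1} \underbrace{(1 \;\; 3 \;\; 9)}_{1 \times 3} = \begin{pmatrix} 2 & 6 & 18 \\ 1 & 3 & 9 \\ 7 & 21 & 63 \end{pmatrix}. \]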
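The general definition, including the size condition, can be sketched in Python as follows. This is an illustration only: `mat_mul` is a name chosen here, matrices are stored as lists of rows, and a 1 × 1 result is returned as a matrix `[[68]]` rather than as a number.

```python
def mat_mul(A, B):
    """Product of an m x n matrix A and an n x r matrix B (lists of rows)."""
    n = len(A[0])
    if n != len(B):
        # AB is only defined if (columns of A) = (rows of B).
        raise ValueError("columns of A must equal rows of B")
    r = len(B[0])
    # Entry (i, k) is a_i1*b_1k + a_i2*b_2k + ... + a_in*b_nk.
    return [[sum(A[i][j] * B[j][k] for j in range(n)) for k in range(r)]
            for i in range(len(A))]

# Example 11.8.1: a 1 x 3 matrix times a 3 x 2 matrix.
print(mat_mul([[-1, 0, 1]], [[0, -1], [1, 2], [-2, 5]]))   # [[-2, 6]]

# Example 11.8.2: AB is 1 x 1 while BA is 3 x 3.
row, col = [[1, 3, 9]], [[2], [1], [7]]
print(mat_mul(row, col))   # [[68]]
print(mat_mul(col, row))   # [[2, 6, 18], [1, 3, 9], [7, 21, 63]]
```

Calling `mat_mul` with the factors of Example 11.8.1 in the opposite order raises the `ValueError`, matching the statement that BA is not defined there.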
11.8.3 Properties
For matrices of appropriate size
(1) (AB)C = A(BC) (Associativity)
(2) (A + B)C = AC + BC, A(B + C) = AB + AC (Distributive Laws)
Another unusual property of matrices is that AB = 0 does not imply A = 0 or B = 0. It is possible for the product of two non-zero matrices to be zero:
\[ \begin{pmatrix} 1 & 1 \\ 2 & 2 \end{pmatrix} \begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}. \]
Also, if AC = BC, or CA = CB, then it is not true in general that A = B. (If AC = BC, then AC − BC = 0 and (A − B)C = 0, but this does not imply A − B = 0 or C = 0.)
11.9 Transposition
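The transpose of an m × n matrix $A = (a_{ij})$ is the n × m matrix $A^T$ with entries
\[ a^T_{ij} = a_{ji}, \quad \text{for all } i, j, \]
i.e.
\[ \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{pmatrix}^{T} = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{m1} \\ a_{12} & a_{22} & \cdots & a_{m2} \\ \vdots & \vdots & & \vdots \\ a_{1n} & a_{2n} & \cdots & a_{mn} \end{pmatrix}. \]
So the row vectors of A become column vectors of $A^T$ and vice versa.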
11.9.1 Examples of transpose
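If $A = \begin{pmatrix} 5 & -8 & 1 \\ 4 & 0 & 0 \end{pmatrix}$, $B = \begin{pmatrix} 1 & -1 \\ 2 & 0 \end{pmatrix}$, $C = (7 \;\; 5 \;\; -2)$ then find $A^T$, $B^T$, $C^T$.
\[ A^T = \begin{pmatrix} 5 & -8 & 1 \\ 4 & 0 & 0 \end{pmatrix}^{T} = \begin{pmatrix} 5 & 4 \\ -8 & 0 \\ 1 & 0 \end{pmatrix}, \]
\[ B^T = \begin{pmatrix} 1 & -1 \\ 2 & 0 \end{pmatrix}^{T} = \begin{pmatrix} 1 & 2 \\ -1 & 0 \end{pmatrix}, \]
\[ C^T = (7 \;\; 5 \;\; -2)^{T} = \begin{pmatrix} 7 \\ 5 \\ -2 \end{pmatrix}. \]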
11.9.2 Properties
For matrices of appropriate size
(1) $(\alpha A)^T = \alpha A^T$, α ∈ R
(2) $(A + B)^T = A^T + B^T$
(3) $(A^T)^T = A$
(4) $(AB)^T = B^T A^T$ (not $A^T B^T$!).
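Property (4) is the one most often misremembered, so a numerical check may help. The sketch below is in Python with helper names chosen here; A is the matrix from Example 11.9.1 and B is the 3 × 2 matrix from Example 11.8.1, picked only because the sizes are compatible.

```python
def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Product AB for matrices stored as lists of rows (sizes assumed compatible)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A = [[5, -8, 1], [4, 0, 0]]        # 2 x 3
B = [[0, -1], [1, 2], [-2, 5]]     # 3 x 2

lhs = transpose(mat_mul(A, B))               # (AB)^T
rhs = mat_mul(transpose(B), transpose(A))    # B^T A^T
print(lhs, rhs, lhs == rhs)
```

Note that for these sizes the other order, A^T B^T, is a 3 × 3 matrix, so it cannot equal the 2 × 2 matrix (AB)^T.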
11.9.3 Dot product expressed as matrix multiplication
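A column vector
\[ v = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \in \mathbb{R}^n \]
may be interpreted as an n × 1 matrix. Then the dot product (for two column vectors v and w) may be expressed using matrix multiplication:
\[ v \cdot w = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n = \underbrace{(v_1 \;\; v_2 \;\; \cdots \;\; v_n)}_{1 \times n} \underbrace{\begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}}_{n \times 1} = v^T w. \]
Also
\[ v \cdot v = v_1^2 + v_2^2 + \cdots + v_n^2 = \|v\|^2 = v^T v. \]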
12 Gaussian elimination
12.1 Simultaneous equations
One of the main applications of matrices is solving simultaneous equations of the form
\[ \begin{aligned} a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\ a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\ &\;\;\vdots \\ a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n &= b_m. \end{aligned} \]
Here $x_1, x_2, \ldots, x_n$ are the variables (unknowns) and the coefficients $a_{ij}$ and $b_i$ are given real numbers.
This set of equations can be written using matrix multiplication as
\[ Ax = b \tag{12.1} \]
where $x = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix}$ is a column vector of unknowns, $b = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}$ is a given m × 1 matrix (column vector) and $A = (a_{ij})$ is the m × n matrix of coefficients.
An efficient way of solving a system is by Gaussian elimination.
12.2 Gaussian elimination
We illustrate the process of Gaussian elimination. Given a set of equations, the idea is to perform operations, called elementary row operations, in order to simplify the equations to the point where the solution may be obtained easily.
Solve the following system of equations:
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ -3x_1 - 4x_2 + 8x_3 &= 3 \\ 2x_1 + 5x_2 - 8x_3 &= -5 \end{aligned} \]
Add 3 times the first equation to the second equation:
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ 2x_2 - x_3 &= 0 \\ 2x_1 + 5x_2 - 8x_3 &= -5 \end{aligned} \]
Subtract 2 times the first equation from the third equation:
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ 2x_2 - x_3 &= 0 \\ x_2 - 2x_3 &= -3 \end{aligned} \]
Swap the order of equations 2 and 3:
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ x_2 - 2x_3 &= -3 \\ 2x_2 - x_3 &= 0 \end{aligned} \]
Add −2 times the second equation to the third equation:
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ x_2 - 2x_3 &= -3 \\ 3x_3 &= 6 \end{aligned} \]
Multiply the third equation by 1/3:
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ x_2 - 2x_3 &= -3 \\ x_3 &= 2 \end{aligned} \]
This is called the Gauss reduced form or row echelon form of the system. Now we can solve by back substitution:
The third equation is $x_3 = 2$.
Substitute $x_3 = 2$ into the second equation $x_2 - 2x_3 = -3$, so $x_2 - 4 = -3$, so $x_2 = 1$.
Substitute $x_2$ and $x_3$ into the first equation $x_1 + 2x_2 - 3x_3 = -1$, so $x_1 + 2 - 6 = -1$, so $x_1 = 3$.
This process is called Gaussian elimination. We call this simplified set of equations
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ x_2 - 2x_3 &= -3 \\ x_3 &= 2 \end{aligned} \]
the reduced form of the equations. Once in reduced form the set of equations is easily solved by back substitution.
In this example we repeatedly performed the following operations:
(1) Interchange the order of two equations: swap the ith and jth equations.
(2) Multiply an equation through by a non-zero number α.
(3) Add α times the jth equation to the ith equation.
Performing these operations does not change the solution of the equations. But eventually we obtained such a simple system that we could easily find the solution.
12.2.1 Gaussian elimination using matrices
The above procedure may be described most conveniently in terms of matrices. The information present in a set of simultaneous equations
\[ A_{m \times n}\, x_{n \times 1} = b_{m \times 1} \]
may be summarized by constructing a new matrix
\[ (A \mid b), \]
which is A with one extra column, b. This is called the augmented matrix. Instead of manipulating equations we manipulate the corresponding augmented matrix. This way we keep writing to a minimum.
The augmented matrix for the set of equations in the previous example (Section 12.2) may be written
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ -3 & -4 & 8 & 3 \\ 2 & 5 & -8 & -5 \end{array}\right) \]
Always keep in mind that each row of the augmented matrix represents an equation. The last column is the RHS. The other columns are the coefficients of the variables. E.g. the last row above corresponds to $2x_1 + 5x_2 - 8x_3 = -5$.
Another advantage of the augmented matrix notation is that if we want to solve two sets of equations with the same coefficient matrix but different RHS:
\[ Ax = b, \quad\text{and}\quad Ay = c, \]
we can do both at once by forming the augmented matrix
\[ (A \mid b \mid c). \]
See later.
12.2.2 Elementary row operations (EROs)
We rewrite the three allowable operations on equations discussed above, in terms of their effect on the augmented matrix. Call the row vectors of the augmented matrix $r_1, r_2, \ldots$ etc. Since each row corresponds to an equation, the operations are:
(1) Interchange rows i and j, written $r_i \leftrightarrow r_j$.
(2) Replace $r_i$ with $\alpha r_i$, α ∈ R, α ≠ 0, written $r_i \to \alpha r_i$.
(3) Replace $r_i$ with $r_i + \alpha r_j$, α ∈ R, where $r_j$ is another row of A, written $r_i \to r_i + \alpha r_j$.
These three operations are known as elementary row operations (EROs). In general we say that m × n matrices A, B are row equivalent, written A ∼ B, if and only if they are related by a series of EROs.
To solve a system of equations:
• Set up the augmented matrix (A | b) representing the system.
• Apply EROs to (A | b) to reduce it to a simpler matrix.
• Eventually the system is so simple that the solution may be read off.
This is one of the most efficient procedures for solving linear equations.
12.2.3 Example
Redo the previous example (Section 12.2), using row reductions. You should check that we are doing exactly the same calculations as before, just with less writing.
The augmented matrix is:
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ -3 & -4 & 8 & 3 \\ 2 & 5 & -8 & -5 \end{array}\right) \]
Add 3 times row 1 to row 2: $r_2 \to r_2 + 3r_1$.
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ 0 & 2 & -1 & 0 \\ 2 & 5 & -8 & -5 \end{array}\right) \]
Subtract 2 times row 1 from row 3: $r_3 \to r_3 - 2r_1$.
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ 0 & 2 & -1 & 0 \\ 0 & 1 & -2 & -3 \end{array}\right) \]
The first column is done. Now we want a 1 in the (2, 2) position. Swap rows 2 and 3: $r_2 \leftrightarrow r_3$.
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ 0 & 1 & -2 & -3 \\ 0 & 2 & -1 & 0 \end{array}\right) \]
Add −2 times row 2 to row 3: $r_3 \to r_3 - 2r_2$.
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ 0 & 1 & -2 & -3 \\ 0 & 0 & 3 & 6 \end{array}\right) \]
Multiply row 3 by 1/3: $r_3 \to \tfrac{1}{3} r_3$.
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ 0 & 1 & -2 & -3 \\ 0 & 0 & 1 & 2 \end{array}\right) \]
This is called the Gauss reduced form or row echelon form of the system. Now we can solve by back substitution:
The third row is the equation $x_3 = 2$.
Substitute $x_3 = 2$ into the second equation $x_2 - 2x_3 = -3$, so $x_2 - 4 = -3$, so $x_2 = 1$.
Substitute $x_2$ and $x_3$ into the first equation $x_1 + 2x_2 - 3x_3 = -1$, so $x_1 + 2 - 6 = -1$, so $x_1 = 3$.
The solution is
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ 2 \end{pmatrix}. \]
12.2.4 The Gaussian algorithm
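To solve
\[ A_{m \times n}\, x_{n \times 1} = b_{m \times 1}, \]
first set up the augmented matrix (A | b).
Then
- Find a row with entry α ≠ 0 in column 1. (If none, go to column 2.)
- Multiply the row by 1/α to make the leading entry 1.
- Swap rows to bring the 1 to the top:
\[ \begin{pmatrix} 1 & \bullet & \bullet & \bullet \\ * & \bullet & \bullet & \bullet \\ * & \bullet & \bullet & \bullet \\ \vdots & & & \end{pmatrix} \]
- Subtract appropriate multiples of row 1 to clear out the first column:
\[ \begin{pmatrix} 1 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \\ 0 & \bullet & \bullet & \bullet \\ \vdots & & & \end{pmatrix} \]
First column is now done.
Repeat on second column:
Find a row (not the first) with entry α ≠ 0 in column 2. (If none exists, go on to third column.) Make the leading entry 1, and clear out all the entries below this 1:
\[ \begin{pmatrix} 1 & \bullet & \bullet & \bullet \\ 0 & 1 & \bullet & \bullet \\ 0 & 0 & \bullet & \bullet \\ \vdots & & & \end{pmatrix} \]
Then go on to the third column etc.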
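The steps above can be sketched in Python as follows. This is an illustrative sketch rather than the course's MATLAB session: the names `row_echelon` and `back_substitute` are chosen here, exact fractions are used to avoid rounding, the first suitable row is always taken as the pivot row, and `back_substitute` assumes the system has a unique solution with the pivots on the diagonal.

```python
from fractions import Fraction

def row_echelon(M):
    """Return a row echelon form of M (list of rows), using exact fractions."""
    M = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(M), len(M[0])
    pivot_row = 0
    for col in range(cols):
        # Find a row at or below pivot_row with a non-zero entry in this column.
        src = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if src is None:
            continue  # no pivot in this column; go on to the next column
        M[pivot_row], M[src] = M[src], M[pivot_row]       # swap rows
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]   # make the leading entry 1
        for r in range(pivot_row + 1, rows):              # clear entries below the 1
            factor = M[r][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == rows:
            break
    return M

def back_substitute(G):
    """Solve from a row echelon augmented matrix, assuming a unique solution."""
    n = len(G[0]) - 1
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        x[i] = G[i][n] - sum(G[i][j] * x[j] for j in range(i + 1, n))
    return x

G = row_echelon([[1, 2, -3, -1], [-3, -4, 8, 3], [2, 5, -8, -5]])
x = back_substitute(G)
print([int(xi) for xi in x])   # [3, 1, 2]
```

Because this sketch does not swap rows to avoid fractions, its second row is (0, 1, −1/2 | 0) rather than the (0, 1, −2 | −3) obtained by hand in Example 12.2.3. Both are valid row echelon forms and both lead to the same solution.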
12.2.5 Gauss reduced form or row echelon form
When solving a system Ax = b, we can stop doing EROs on (A | b) once A is “simple enough” to find the solution by back substitution. This means we reduce (A | b) until:
• All rows consisting entirely of 0’s are at the bottom.
• The first non-zero entry (leading entry or pivot) of each row is 1, and appears in a column to the right of the leading entry of the row above it.
• All entries in a column below a leading entry are zeros.
The reduced matrix is called the Gauss reduced form (or row echelon form) of A. For example:
\[ A \sim \cdots \sim G = \left(\begin{array}{ccccc|c} 1 & \bullet & \bullet & \bullet & \bullet & \bullet \\ 0 & 1 & \bullet & \bullet & \bullet & \bullet \\ 0 & 0 & 0 & 1 & \bullet & \bullet \\ 0 & 0 & 0 & 0 & 0 & \bullet \end{array}\right) \]
The leading entries (pivots) must be 1 and must have zeros below them. Not every column must have a pivot. Above, the third column has no pivot.
The leading entries must form a “staircase” from top left to bottom right.
(Note that a matrix A may have more than one Gauss reduced form. E.g. the entries marked • above are not unique. It doesn’t matter which Gauss reduced form is used.)
Tips:
• Try to do row reductions that lead to the easiest algebra.
• Try to avoid fractions.
• Use row swaps to take advantage of existing 1’s in the matrix.
We will look at the Gaussian Algorithm in a MATLAB session.
12.2.6 Rank of a Matrix
The number of pivots (leading 1’s) in the row echelon form of a matrix A is called its rank, denoted rank A. The matrices G (and hence A) above have rank 3.
12.2.7 Example
Which of the following matrices are in Gauss reduced form (row echelon form)? For those in row echelon form, determine the rank.
\[ A = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 0 & 0 & 1 & 4 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 1 & 0 & 2 \\ 0 & 0 & 1 & -2 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \qquad C = \begin{pmatrix} 0 & 1 & 0 & 2 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix} \]
A, B are row-reduced. C is not, because the leading 1 in the second row (in column 1) is not to the right of the leading 1 in the first row (in column 2).
rank A = 3. rank B = 2.
12.2.8 Inconsistent Equations
Not every system of linear equations has a solution. Consider the following example.
Solve
\[ \begin{aligned} x_1 + 2x_2 - x_3 &= 4 \\ 2x_1 + 7x_2 + x_3 &= 14 \\ 3x_1 + 8x_2 - x_3 &= 17 \end{aligned} \]
The augmented matrix is
\[ \begin{aligned} \left(\begin{array}{rrr|r} 1 & 2 & -1 & 4 \\ 2 & 7 & 1 & 14 \\ 3 & 8 & -1 & 17 \end{array}\right) &\sim \left(\begin{array}{rrr|r} 1 & 2 & -1 & 4 \\ 0 & 3 & 3 & 6 \\ 0 & 2 & 2 & 5 \end{array}\right) && (r_2 \to r_2 - 2r_1),\ (r_3 \to r_3 - 3r_1) \\ &\sim \left(\begin{array}{rrr|r} 1 & 2 & -1 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 2 & 2 & 5 \end{array}\right) && (r_2 \to \tfrac{1}{3} r_2) \\ &\sim \left(\begin{array}{rrr|r} 1 & 2 & -1 & 4 \\ 0 & 1 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{array}\right) && (r_3 \to r_3 - 2r_2) \end{aligned} \]
This is in row echelon form. The bottom line corresponds to:
\[ 0 \cdot x_1 + 0 \cdot x_2 + 0 \cdot x_3 = 1, \]
which is impossible. Therefore there are no solutions to the system.
12.2.9 How to recognize an inconsistent system
The Gauss reduced form of the augmented matrix will have one or more rows of the form (0 0 . . . 0 | a) where a ≠ 0. This leads to an impossible equation $0x_1 + 0x_2 + \cdots + 0x_n = a \neq 0$.
Geometrically, a linear equation in $\mathbb{R}^2$ represents a line. An equation in $\mathbb{R}^3$ represents a plane etc. Solving two equations in $\mathbb{R}^2$ thus corresponds to finding the intersection of two lines in $\mathbb{R}^2$. Usually two lines in $\mathbb{R}^2$ have a unique point of intersection. But not always: parallel lines do not.
In fact parallel lines do not intersect at all, so there are no solutions, unless the lines are right on top of each other, in which case they intersect in infinitely many points.
Solving three equations in $\mathbb{R}^3$ corresponds to finding the intersection of three planes. Usually three planes have a unique point of intersection, but not always. Similarly for equations in more variables. This underlies the next two examples.
Three planes meeting in a point
Three planes meeting in a line
12.2.10 More than one solution
A system of equations may have more than one solution.
Solve the system of equations
\[ \begin{aligned} x_1 + 3x_2 - 2x_3 &= 5 \\ -x_1 + x_2 - 2x_3 &= -9 \\ 2x_1 + 4x_2 - 2x_3 &= 12 \end{aligned} \]
The augmented matrix is:
\[ \begin{aligned} \left(\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ -1 & 1 & -2 & -9 \\ 2 & 4 & -2 & 12 \end{array}\right) &\sim \left(\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & 4 & -4 & -4 \\ 0 & -2 & 2 & 2 \end{array}\right) && (r_2 \to r_2 + r_1),\ (r_3 \to r_3 - 2r_1) \\ &\sim \left(\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & 1 & -1 & -1 \\ 0 & -2 & 2 & 2 \end{array}\right) && (r_2 \to \tfrac{1}{4} r_2) \\ &\sim \left(\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & 1 & -1 & -1 \\ 0 & 0 & 0 & 0 \end{array}\right) && (r_3 \to r_3 + 2r_2) \end{aligned} \]
(Gauss reduced form). The last row is equivalent to
\[ 0 \cdot x_1 + 0 \cdot x_2 + 0 \cdot x_3 = 0, \]
which gives no information.
Therefore we effectively have just 2 equations in 3 unknowns:
\[ \begin{aligned} x_1 + 3x_2 - 2x_3 &= 5 \\ x_2 - x_3 &= -1 \end{aligned} \]
There is no leading 1 in the third column. So set $x_3 = \alpha$ arbitrary, and back substitute into $r_2$ and $r_1$ to give
\[ x_2 = -1 + x_3 = -1 + \alpha, \]
\[ x_1 = 5 - 3x_2 + 2x_3 = 8 - 3\alpha + 2\alpha = 8 - \alpha. \]
This means, for any choice of α ∈ R, a solution is:
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 - \alpha \\ -1 + \alpha \\ \alpha \end{pmatrix} = \begin{pmatrix} 8 \\ -1 \\ 0 \end{pmatrix} + \alpha \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}. \]
The last equation is the equation of a line in $\mathbb{R}^3$: x is given by a position vector $p = \begin{pmatrix} 8 \\ -1 \\ 0 \end{pmatrix}$ plus α times a vector $v = \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}$ indicating direction.
Thus geometrically the three planes intersect along a common line rather than at a single point. Every solution x corresponds to a point on this line obtained by a particular choice of α.
12.2.11 How to recognize non-unique solutions of Ax = b
The system is consistent, and the Gauss reduced matrix will have one or more columns without a pivot (leading entry). That is, the rank of A is less than the number of columns of A.
For example, if the second column does not have a leading 1 then there will never be an equation we can use to isolate $x_2$, so $x_2$ will be a free variable: we will be able to assign it any value (α say), and still solve the system. In a big system we may have several free variables.
Any system with more than one solution will have a free variable. Since this variable can take any value, the system will then have infinitely many solutions.
12.3 The general solution
12.3.1 General form of solutions
In general a system of equations
\[ A_{m \times n}\, x_{n \times 1} = b_{m \times 1} \]
will have one of the following:
• A unique solution
• An infinite number of solutions
• No solution.
Which behaviour occurs can be seen from the Gauss reduced form of the matrix.
We shall show that if there are an infinite number of solutions, they will all be of the form p + y where p is a particular solution and y is the part arising from the free variables. We shall show that y is a solution of Ax = 0.
It is useful to introduce some notation.
12.3.2 The Nullspace of a matrix
Let A be a matrix. The set of vectors x with Ax = 0 is called the nullspace of A, denoted NS(A).
The nullspace always contains the zero vector, since A0 = 0. We shall show later that the nullspace either contains only the zero vector, or else contains infinitely many vectors.
The linear system Ax = 0 is called a homogeneous system. (Homogeneous means “all the same”.) Thus the nullspace of A is the set of solutions of the homogeneous system.
A linear system Ax = b with b ≠ 0 is called inhomogeneous.
12.3.3 Example
Let $A = \begin{pmatrix} 1 & 3 & -2 \\ -1 & 1 & -2 \\ 2 & 4 & -2 \end{pmatrix}$ and let $b = \begin{pmatrix} 5 \\ -9 \\ 12 \end{pmatrix}$.
(a) Calculate NS(A). (b) Solve the system Ax = b.
For the nullspace we have to solve Ax = 0.
\[ \begin{aligned} \left(\begin{array}{rrr|r} 1 & 3 & -2 & 0 \\ -1 & 1 & -2 & 0 \\ 2 & 4 & -2 & 0 \end{array}\right) &\sim \left(\begin{array}{rrr|r} 1 & 3 & -2 & 0 \\ 0 & 4 & -4 & 0 \\ 0 & -2 & 2 & 0 \end{array}\right) && (r_2 \to r_2 + r_1),\ (r_3 \to r_3 - 2r_1) \\ &\sim \left(\begin{array}{rrr|r} 1 & 3 & -2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & -2 & 2 & 0 \end{array}\right) && (r_2 \to \tfrac{1}{4} r_2) \\ &\sim \left(\begin{array}{rrr|r} 1 & 3 & -2 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right) && (r_3 \to r_3 + 2r_2) \end{aligned} \]
(Gauss reduced form). Although we have a system of 3 equations in 3 unknowns, the rank of this matrix is 2, not 3; the bottom row gives no information. So $x_3$ can be anything. Say $x_3 = \alpha$. The middle row says $x_2 - x_3 = 0$ so $x_2 = \alpha$. The top row gives $x_1 + 3x_2 - 2x_3 = 0$, so $x_1 + 3\alpha - 2\alpha = 0$ so $x_1 = -\alpha$.
\[ NS(A) = \left\{ \alpha \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} \;\middle|\; \alpha \in \mathbb{R} \right\}. \]
Inhomogeneous case Ax = b. We did this in the previous example. We found
\[ x = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 8 - \alpha \\ -1 + \alpha \\ \alpha \end{pmatrix} = \begin{pmatrix} 8 \\ -1 \\ 0 \end{pmatrix} + \alpha \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix} = p + y, \]
where $p = \begin{pmatrix} 8 \\ -1 \\ 0 \end{pmatrix}$ is a particular solution of Ax = b and y is any element in the nullspace NS(A).
12.3.4 Relation between homogeneous & inhomogeneous systems
The solutions of the homogeneous system Ax = 0 and the inhomogeneous system Ax = b are related.
Let A be a matrix, and consider the system
(∗) Ax = b.
Let p be a fixed solution of (∗). Then the complete set of solutions of (∗) is
{x = p + y | y ∈ NS(A)}.
Proof: If p is a solution and y ∈ NS(A) then A(p + y) = Ap + Ay = b + 0 = b. So every x of the form p + y is a solution.
Conversely, if x is a solution of (∗) then Ax = b = Ap. Thus A(x − p) = Ax − Ap = b − b = 0. So let y = x − p ∈ NS(A). Then x = p + y has the stated form.
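The first half of the proof can be illustrated numerically with the data of Example 12.3.3. The following Python sketch uses a helper `mat_vec` named here for illustration, and the sample values of α are arbitrary.

```python
def mat_vec(A, v):
    """Product of matrix A (list of rows) with column vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[1, 3, -2], [-1, 1, -2], [2, 4, -2]]
b = [5, -9, 12]
p = [8, -1, 0]    # particular solution from Example 12.3.3
y = [-1, 1, 1]    # spans NS(A) in Example 12.3.3

checks = []
for alpha in (-2, 0, 1, 5):
    x = [pi + alpha * yi for pi, yi in zip(p, y)]   # x = p + alpha * y
    checks.append(mat_vec(A, x) == b)

print(mat_vec(A, y), checks)   # [0, 0, 0] [True, True, True, True]
```

A check like this cannot replace the proof, since it only tests finitely many values of α, but it is a quick way to catch arithmetic slips in p or y.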
12.3.5 Summary
There are 3 possible behaviours for a system:
1. The system Ax = b may have no solutions. (Inconsistent case.)
2. If a solution to Ax = b exists and NS(A) = {0} then Ax = b has a unique solution.
3. If Ax = b has a solution x = p and NS(A) ≠ {0} then there are infinitely many solutions: all x of the form x = p + y with y ∈ NS(A).
Note that a homogeneous system is always consistent (case (1) cannot apply) because the 0 vector is always a solution.
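The three cases can be told apart by counting pivots. The Python sketch below is one possible way to do this, under the following assumptions made here: the test for inconsistency (a row (0 … 0 | a) with a ≠ 0) is restated as "the augmented matrix has more pivots than A", and NS(A) = {0} is restated as "rank A equals the number of columns of A". The function names `rank` and `classify` are hypothetical.

```python
from fractions import Fraction

def rank(M):
    """Number of pivots found by Gaussian elimination (exact arithmetic)."""
    M = [[Fraction(x) for x in row] for row in M]
    pivots = 0
    for col in range(len(M[0])):
        # Look for a non-zero entry in this column, at or below the current pivot row.
        src = next((r for r in range(pivots, len(M)) if M[r][col] != 0), None)
        if src is None:
            continue
        M[pivots], M[src] = M[src], M[pivots]
        for r in range(pivots + 1, len(M)):
            factor = M[r][col] / M[pivots][col]
            M[r] = [a - factor * b for a, b in zip(M[r], M[pivots])]
        pivots += 1
        if pivots == len(M):
            break
    return pivots

def classify(A, b):
    """Describe the solution set of Ax = b."""
    augmented = [row + [bi] for row, bi in zip(A, b)]
    if rank(augmented) > rank(A):
        return "no solution"                  # a row (0 ... 0 | a) with a != 0
    if rank(A) == len(A[0]):
        return "unique solution"              # every column has a pivot
    return "infinitely many solutions"        # at least one free variable

print(classify([[1, 2, -3], [-3, -4, 8], [2, 5, -8]], [-1, 3, -5]))
print(classify([[1, 2, -1], [2, 7, 1], [3, 8, -1]], [4, 14, 17]))
print(classify([[1, 3, -2], [-1, 1, -2], [2, 4, -2]], [5, -9, 12]))
```

The three calls use the systems of Sections 12.2, 12.2.8 and 12.2.10 respectively, so they should report a unique solution, no solution, and infinitely many solutions.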
12.4 Gauss-Jordan elimination
Gauss-Jordan (or simply Jordan) reduction involves performing more EROs on the Gauss reduced form of a matrix until it has the additional property that each leading 1 is the only non-zero entry in its column.
Solve
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 &= -1 \\ -3x_1 - 4x_2 + 8x_3 &= 3 \\ 2x_1 + 5x_2 - 8x_3 &= -5 \end{aligned} \]
12.4.1 Example of Gauss-Jordan Reduction
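This is example 12.2.3 again. The augmented matrix is:
\[ \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ -3 & -4 & 8 & 3 \\ 2 & 5 & -8 & -5 \end{array}\right) \sim \left(\begin{array}{rrr|r} 1 & 2 & -3 & -1 \\ 0 & 1 & -2 & -3 \\ 0 & 0 & 1 & 2 \end{array}\right) \]
This is the Gauss reduced form. But we can continue doing row operations.
\[ \begin{aligned} &\sim \left(\begin{array}{rrr|r} 1 & 2 & 0 & 5 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 2 \end{array}\right) && (r_1 \to r_1 + 3r_3),\ (r_2 \to r_2 + 2r_3) \\ &\sim \left(\begin{array}{rrr|r} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 2 \end{array}\right) && (r_1 \to r_1 - 2r_2) \end{aligned} \]
This is the Jordan reduced form. We can read off the solution directly: $x_1 = 3$, $x_2 = 1$, $x_3 = 2$.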
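The extra step from Gauss reduced form to Jordan reduced form can be sketched in Python as below. The function name is chosen here for illustration, and the sketch assumes its input is already in Gauss reduced form with every leading entry equal to 1.

```python
from fractions import Fraction

def gauss_jordan_from_echelon(G):
    """Clear the entries above each leading 1 of a row echelon matrix G."""
    M = [[Fraction(x) for x in row] for row in G]
    for i in reversed(range(len(M))):          # work from the bottom row upwards
        # Column of the leading entry of row i (None for a row of zeros).
        lead = next((j for j, x in enumerate(M[i]) if x != 0), None)
        if lead is None:
            continue
        for r in range(i):                     # rows above row i
            factor = M[r][lead]
            M[r] = [a - factor * b for a, b in zip(M[r], M[i])]
    return M

G = [[1, 2, -3, -1], [0, 1, -2, -3], [0, 0, 1, 2]]
J = gauss_jordan_from_echelon(G)
print([[int(x) for x in row] for row in J])
# [[1, 0, 0, 3], [0, 1, 0, 1], [0, 0, 1, 2]]
```

Working upwards from the last pivot reproduces the order of operations used above: first column 3 is cleared using r3, then column 2 using r2.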
12.4.2 The difference between Gauss and Gauss-Jordan reduction
Gauss reduction leads to a matrix with 0’s below the leading 1’s, such as
\[ \begin{pmatrix} 1 & * & * & * \\ 0 & 1 & * & * \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \]
while Gauss-Jordan reduction results in a matrix with 0’s above and below the leading 1’s such as
\[ \begin{pmatrix} 1 & 0 & 3 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \]
Jordan form has the advantage that the solution of the original system can be read off directly with little or no back substitution. However Gaussian reduction is often preferred for solving systems.
Gauss-Jordan reduction is better if we want to solve many systems Ax = b, Ax = c etc. with the same coefficient matrix A, since it eliminates the need to do multiple back substitution steps. This is useful in the next chapter.
13 Inverses
For the rest of the course unless noted otherwise the matrices will be square.
13.1 The Identity Matrix
An n × n matrix (m = n) is called a square matrix of order n. The diagonal containing the entries
\[ a_{11}, a_{22}, \ldots, a_{nn} \]
is called the main diagonal (or principal diagonal) of A. If the entries above this diagonal are all zero then A is called lower triangular. If all the entries below the diagonal are zero, A is called upper triangular.
\[ \underbrace{\begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ \vdots & a_{22} & \ddots & \vdots \\ \vdots & & \ddots & 0 \\ a_{n1} & \cdots & \cdots & a_{nn} \end{pmatrix}}_{\text{lower triangular}} \qquad \underbrace{\begin{pmatrix} a_{11} & \cdots & \cdots & a_{1n} \\ 0 & a_{22} & & \vdots \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & a_{nn} \end{pmatrix}}_{\text{upper triangular}} \]
If elements above and below the principal diagonal are zero, so
\[ a_{ij} = 0, \quad i \neq j, \]
then A is called a diagonal matrix. Note that if A is diagonal then $A = A^T$.
13.1.1 Example
$A = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 2 \end{pmatrix}$ is a diagonal 3 × 3 matrix.
13.1.2 Definition: Identity matrix
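The n × n identity matrix $I = I_n$ is the diagonal matrix whose diagonal entries are all 1:
\[ I = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & \cdots & \cdots & 1 \end{pmatrix}. \]
Note that $I = I^T$.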
13.1.3 Comments on the identity matrix
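Recall from linear transformations of $\mathbb{R}^2$: Ai is the first column of A and Aj is the second column.
For example, for the 2 × 2 case we have
\[ AI = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = A = IA. \]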
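In general the jth column vector of I is the coordinate vector
\[ e_j = \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \leftarrow j. \]
Thus
\[ A e_j = \begin{pmatrix} a_{11} & \cdots & a_{1j} & \cdots & a_{1n} \\ a_{21} & \cdots & a_{2j} & \cdots & a_{2n} \\ \vdots & & \vdots & & \vdots \\ a_{n1} & \cdots & a_{nj} & \cdots & a_{nn} \end{pmatrix} \begin{pmatrix} 0 \\ \vdots \\ 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \begin{pmatrix} a_{1j} \\ a_{2j} \\ \vdots \\ a_{nj} \end{pmatrix} = j\text{th column vector of } A. \]
So AI = A for all A.
Replacing A by $A^T$, $A^T I = A^T$ also. Transposing both sides: $A = (A^T)^T = (A^T I)^T = I^T (A^T)^T = IA$, so IA = A for all A.
We have proved that for all square matrices A
\[ IA = AI = A. \]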
13.2 Definition: Inverse
Let A be an n× n square matrix and let I denote the n× n identity matrix.
An n× n matrix A is invertible (or non-singular) if there exists an n× n matrix B such that
AB = BA = I.
Then B is called the inverse of A and is denoted $A^{-1}$. A matrix that is not invertible is also said to be singular.

Note that if A is invertible with inverse B (so $A^{-1} = B$), then AB = BA = I, which also says B is invertible, with inverse A (so $B^{-1} = A$). That is,
\[ A \text{ invertible} \implies A^{-1} \text{ invertible, and } (A^{-1})^{-1} = A. \]
13.2.1 Inverse for the 2 × 2 case
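Let A be a 2 × 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Set $B = \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$. Then
\[ AB = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix} = \begin{pmatrix} ad - bc & 0 \\ 0 & ad - bc \end{pmatrix} = BA. \]
Let $\Delta = ad - bc$. If $\Delta \neq 0$, then
\[ A^{-1} = \frac{1}{\Delta} B = \frac{1}{\Delta}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}. \]
We call Δ the determinant of A. We will consider determinants in more detail later (chapter 14). The problem of obtaining inverses of larger square matrices will also be considered later (section 13.2.7).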
13.2.2 Example
Find the inverse matrix of $A = \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix}$. Check your answer.

\[ \Delta = 1 \times 5 - 2 \times 3 = -1 \quad\Rightarrow\quad A^{-1} = \frac{1}{-1}\begin{pmatrix} 5 & -2 \\ -3 & 1 \end{pmatrix} = \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix}. \]
Check whether $AA^{-1} = A^{-1}A = I$. Multiplying the two matrices A and $A^{-1}$ gives
\[ \begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix}\begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \qquad \begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 3 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \]
So A is invertible with inverse $\begin{pmatrix} -5 & 2 \\ 3 & -1 \end{pmatrix}$.
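The 2 × 2 formula can be tried out numerically. The following is a minimal sketch in Python, not part of the course material: the function name `inverse_2x2` is an illustrative choice, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Return the inverse of [[a, b], [c, d]] as nested lists, or None if singular."""
    delta = Fraction(a) * d - Fraction(b) * c   # the determinant ad - bc
    if delta == 0:
        return None                              # no inverse when the determinant is 0
    return [[d / delta, -b / delta],
            [-c / delta, a / delta]]

print(inverse_2x2(1, 2, 3, 5))   # the matrix from the example above
print(inverse_2x2(1, 2, 3, 6))   # determinant 6 - 6 = 0, so None is returned
```

The first call reproduces the inverse found by hand. The second uses a matrix whose determinant is zero, for which the formula does not apply.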
13.2.3 Example
Show that $A = \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}$ is not invertible.

Suppose A has inverse $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Then
\[ \begin{pmatrix} 1 & 2 \\ 3 & 6 \end{pmatrix}\begin{pmatrix} a & b \\ c & d \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad\Rightarrow\quad \begin{aligned} a + 2c &= 1 \quad (1) \\ b + 2d &= 0 \quad (2) \\ 3a + 6c &= 0 \quad (3) \\ 3b + 6d &= 1 \quad (4) \end{aligned} \]
Hence 3 · (1) − (3) ⇒ 0 = 3. This is impossible, so A can't have an inverse.

Note that Δ = 6 − 2 · 3 = 0 in this case.

Later we shall see that a square matrix is invertible exactly when its determinant is not 0.
13.2.4 Example: Inverse of a 3× 3 Matrix
Let $A = \begin{pmatrix} 2 & -3 & -1 \\ 1 & -2 & -3 \\ -2 & 2 & -5 \end{pmatrix}$ and $B = \begin{pmatrix} 16 & -17 & 7 \\ 11 & -12 & 5 \\ -2 & 2 & -1 \end{pmatrix}$. Show that $B = A^{-1}$.

Just multiply the two matrices together.
\[ AB = \begin{pmatrix} 2 & -3 & -1 \\ 1 & -2 & -3 \\ -2 & 2 & -5 \end{pmatrix}\begin{pmatrix} 16 & -17 & 7 \\ 11 & -12 & 5 \\ -2 & 2 & -1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
So AB = I. And, similarly
\[ BA = \begin{pmatrix} 16 & -17 & 7 \\ 11 & -12 & 5 \\ -2 & 2 & -1 \end{pmatrix}\begin{pmatrix} 2 & -3 & -1 \\ 1 & -2 & -3 \\ -2 & 2 & -5 \end{pmatrix} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}. \]
So BA = I.
13.2.5 Further Properties of inverses
(1) If A, B are invertible so is AB, and
\[ (AB)^{-1} = B^{-1}A^{-1} \quad (\text{not } A^{-1}B^{-1}!) \]
(2) A is invertible if and only if $A^T$ is invertible, in which case
\[ (A^T)^{-1} = (A^{-1})^T. \]
13.2.6 Proof
(1) Suppose A, B have inverses $A^{-1}$, $B^{-1}$ respectively. We have to show that AB has inverse $B^{-1}A^{-1}$. So we just multiply AB by $B^{-1}A^{-1}$, and check that we get the identity matrix I:
\[ (AB)B^{-1}A^{-1} = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I \]
and similarly $(B^{-1}A^{-1})(AB) = I$.

Therefore AB is invertible with inverse
\[ (AB)^{-1} = B^{-1}A^{-1}. \]
(2) is an exercise:

Assume that A is invertible. We have to show that $A^T$ is invertible, which we do by checking that its inverse is $(A^{-1})^T$. To show that one matrix is the inverse of the other, we just multiply them together and check we get the identity. We will need the fact that $(AB)^T = B^T A^T$. Setting $B = A^{-1}$ in this, we have
\[ (A^{-1})^T A^T = (AA^{-1})^T = I^T = I. \]
Similarly
\[ A^T (A^{-1})^T = (A^{-1}A)^T = I^T = I. \]
Therefore if A is invertible then $A^T$ is invertible with inverse
\[ (A^T)^{-1} = (A^{-1})^T. \]
To prove the converse, just replace A with $A^T$ in the above calculation.
13.2.7 Inverses using Gauss–Jordan reduction
We can now give a systematic way of finding inverses of matrices, using Gauss–Jordan reduction.
Let A be an invertible n× n matrix. We want to find the inverse X of A. Let us try to solve
AX = I.
13.3 Algorithm to find the inverse matrix
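Above we looked at the matrix $A = \begin{pmatrix} 2 & -3 & -1 \\ 1 & -2 & -3 \\ -2 & 2 & -5 \end{pmatrix}$. We want a matrix $X = \begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{pmatrix}$ with
\[ AX = I. \]
Let $\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix}$, $\mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}$, $\mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}$. We need
\[ A\mathbf{x} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad A\mathbf{y} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad A\mathbf{z} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \]
So $A\mathbf{x} = \mathbf{i}$, $A\mathbf{y} = \mathbf{j}$, $A\mathbf{z} = \mathbf{k}$. We can solve these 3 systems by augmenting A with all 3 RHS vectors: (A | i j k). That is, we form (A | I). We use Gauss-Jordan reduction to solve the 3 systems at once and find x, y, z.

Let $A = \begin{pmatrix} 2 & -3 & -1 \\ 1 & -2 & -3 \\ -2 & 2 & -5 \end{pmatrix}$. Find X with AX = I. (This is example 13.2.4.)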
Perform Gauss-Jordan reduction on the augmented matrix (A | I):
\[
\begin{aligned}
\left(\begin{array}{rrr|rrr} 2 & -3 & -1 & 1 & 0 & 0 \\ 1 & -2 & -3 & 0 & 1 & 0 \\ -2 & 2 & -5 & 0 & 0 & 1 \end{array}\right)
&\sim \left(\begin{array}{rrr|rrr} 1 & -2 & -3 & 0 & 1 & 0 \\ 2 & -3 & -1 & 1 & 0 & 0 \\ -2 & 2 & -5 & 0 & 0 & 1 \end{array}\right) && r_1 \leftrightarrow r_2 \\
&\sim \left(\begin{array}{rrr|rrr} 1 & -2 & -3 & 0 & 1 & 0 \\ 0 & 1 & 5 & 1 & -2 & 0 \\ 0 & -2 & -11 & 0 & 2 & 1 \end{array}\right) && r_2 \to r_2 - 2r_1,\ r_3 \to r_3 + 2r_1 \\
&\sim \left(\begin{array}{rrr|rrr} 1 & 0 & 7 & 2 & -3 & 0 \\ 0 & 1 & 5 & 1 & -2 & 0 \\ 0 & 0 & -1 & 2 & -2 & 1 \end{array}\right) && r_1 \to r_1 + 2r_2,\ r_3 \to r_3 + 2r_2 \\
&\sim \left(\begin{array}{rrr|rrr} 1 & 0 & 7 & 2 & -3 & 0 \\ 0 & 1 & 5 & 1 & -2 & 0 \\ 0 & 0 & 1 & -2 & 2 & -1 \end{array}\right) && r_3 \to -r_3 \\
&\sim \left(\begin{array}{rrr|rrr} 1 & 0 & 0 & 16 & -17 & 7 \\ 0 & 1 & 0 & 11 & -12 & 5 \\ 0 & 0 & 1 & -2 & 2 & -1 \end{array}\right) && r_1 \to r_1 - 7r_3,\ r_2 \to r_2 - 5r_3
\end{aligned}
\]
So AX = I if we take $X = \begin{pmatrix} 16 & -17 & 7 \\ 11 & -12 & 5 \\ -2 & 2 & -1 \end{pmatrix}$.

Compare this to 13.2.4.
In general, let $x_j$ be the jth column of the matrix X, and recall that $e_j$ is the jth column of I. Breaking the previous matrix equation AX = I into one equation for each column gives rise to n equations
\[ Ax_1 = e_1, \quad Ax_2 = e_2, \quad \ldots, \quad Ax_n = e_n. \tag{13.1} \]
We can regard each of these as a system to be solved. As we noted when introducing the augmented matrix, we can solve more than one system of equations at once by forming a larger augmented matrix. E.g.
\[ Ax = b, \ Ax = c \quad \longrightarrow \quad (A \mid b \ c) \]
Thus we may solve (13.1) by applying Gauss–Jordan elimination to the augmented matrix
\[ (A \mid e_1 \ e_2 \ \ldots \ e_n) = (A \mid I). \]
Reduce until we reach $(I \mid X)$.

The solution X can now be read off, so AX = I. If this reduction cannot be completed, A has no inverse.
13.3.1 Inverses via Gauss–Jordan. Summary
• If A is n × n, to find $A^{-1}$:
• Set up augmented n× 2n matrix (A | I).
• If this can be reduced to (I | X) then AX = I.
• If it cannot be reduced, then A is not invertible.
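The summary above can be sketched in code. This is an illustrative Python version, not taken from the workbook: the function name `inverse` and the pivot-selection rule (take the first non-zero entry at or below the diagonal) are assumptions made for the sketch, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by reducing (A | I) to (I | X); return None if A is singular."""
    n = len(A)
    # Build the augmented n x 2n matrix (A | I) with exact rational entries.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # reduction to (I | X) cannot be completed
        M[col], M[pivot] = M[pivot], M[col]  # row interchange
        p = M[col][col]
        M[col] = [x / p for x in M[col]]     # scale so the leading entry is 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # subtract a multiple of the pivot row to clear the rest of the column
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]            # the right-hand block is X

A = [[2, -3, -1], [1, -2, -3], [-2, 2, -5]]
print(inverse(A))
```

For the 3 × 3 matrix of example 13.2.4 this returns the same X as the hand calculation, and for a matrix such as the one in 13.2.3 it returns `None`.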
13.4 Justification of the algorithm: Left and Right inverses
Let A be an n× n square matrix and let I denote the n× n identity matrix.
Above we found X with AX = I. However for X to be the inverse of A, the definition of the inverse matrix requires XA = I also. We shall show the surprising fact that if AX = I then XA = I automatically, for square matrices.

We need a little terminology.

If there is an n × n square matrix B such that AB = I then B is called a right inverse for A. If there is an n × n square matrix C such that CA = I then C is called a left inverse for A.

What we previously called an inverse for A (also called a two-sided inverse for A) is a matrix D that is both a left and right inverse for A, which is to say AD = I = DA.

“A has an inverse $D = A^{-1}$” means D is both a left and right inverse for A. We never write $A^{-1}$ for a one-sided inverse of A.

Our algorithm above for finding “the inverse” takes a matrix A, and produces a right inverse for A. Does this method always produce an actual (two-sided) inverse for A?

Note that if B is a right inverse for A, then AB = I, which says A is a left inverse for B:

A has right inverse B =⇒ B has left inverse A. (♦)

Useful fact: Let X, Y, Z be n × n matrices.

XY = I, YZ = I =⇒ X, Y, Z are all invertible, and $X = Z = Y^{-1}$. (♠)

Proof
\[ X = XI = X(YZ) = (XY)Z = IZ = Z. \]
Since X = Z, YZ = I implies YX = I. Thus XY = I = YX, so X and Y are both invertible and $X = Y^{-1}$. Finally, X is invertible and Z = X, so Z is also invertible.
13.5 Uniqueness of the inverse
Another useful observation is that if A is invertible, then its inverse is unique.
To prove uniqueness we suppose A has two inverses B and C, and show that B must equal C. Now:
• CA = I because C is an inverse of A (and hence a left inverse).
• AB = I because B is an inverse of A (and hence a right inverse).
By (♠) [with C = X, A = Y, B = Z] we get B = C.
13.6 Properties equivalent to invertibility
Consider the following statements about an n× n matrix A:
1. A is invertible.
2. A has a left inverse: There exists an n× n matrix C with CA = I.
3. NS(A) = {0}.
4. For every n× 1 vector b the system Ax = b is solvable, with a unique solution x.
5. A has a right inverse: There exists an n× n matrix B with AB = I.
We shall show (1) =⇒ (2) =⇒ (3) =⇒ (4) =⇒ (5) =⇒ (1) so all five properties are equivalent.
This means that if any one of properties (1), (2), (3), (4), (5) holds for a square matrix A, then all five properties hold. If any one property fails, all five fail.
13.6.1 Proof
(1) =⇒ (2) If A is invertible then it has both a left and right inverse, so (2) certainly holds for A.
(2) =⇒ (3) Suppose A has a left inverse C, so CA = I. Let x ∈ NS(A). We must show x = 0. But: x = Ix = (CA)x = C(Ax) = C0 = 0, since Ax = 0 for x ∈ NS(A). So x = 0.

(3) =⇒ (4) If NS(A) = {0} then when we solve Ax = 0 the only solution is x = 0, so there are no free variables. Row reducing must give
\[ (A \mid 0) \sim \left(\begin{array}{ccccc|c} 1 & * & * & \cdots & * & 0 \\ 0 & 1 & * & \cdots & * & 0 \\ 0 & 0 & 1 & \cdots & * & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & 0 \end{array}\right), \]
with a leading entry in every column.

Now consider solving Ax = b. The row reductions on A are the same. Only the augmented column changes. So we get
\[ (A \mid b) \sim \left(\begin{array}{ccccc|c} 1 & * & * & \cdots & * & \bullet \\ 0 & 1 & * & \cdots & * & \bullet \\ 0 & 0 & 1 & \cdots & * & \bullet \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & \bullet \end{array}\right). \]
This will also have a unique solution x, by back substitution.

(4) =⇒ (5) This is our algorithm above. If we can solve every system involving A then we can solve $Ab_1 = e_1$, $Ab_2 = e_2$, . . . where $e_1, e_2, \ldots$ are the standard coordinate vectors. Let B be the n × n matrix with columns $b_1, b_2, \ldots$ The first column of AB will be $Ab_1 = e_1$. The second column will be $Ab_2 = e_2$, etc. Thus AB = I so B is a right inverse of A.
Here is the difficult step:
(5) =⇒ (1)
Suppose (5) holds for A. That is, A has a right inverse. Call it D, so AD = I.

By (♦) this means D has a left inverse, A.

Thus, property (2) holds for D. But we have already shown (2) =⇒ (3) =⇒ (4) =⇒ (5). Therefore, since (2) holds for D, (5) must also hold for D.

That is, D has a right inverse also, E say, so DE = I.

Apply (♠) to the two equations AD = I and DE = I [with A = X, D = Y, E = Z].

It follows that A is invertible, and that D is its two-sided inverse: $D = A^{-1}$. This completes the proof.

Thus, if the Gauss-Jordan algorithm terminates and produces a right inverse D for A, then A must indeed be invertible, and D must indeed be the unique (two-sided) inverse of A. And if the reduction to (I | X) cannot be completed then A does not even have a right inverse, so it certainly does not have a two-sided inverse.

In summary: the algorithm for finding $A^{-1}$ works.
13.6.2 Special case: n equations in n unknowns
Consider the system Ax = b. In the special case that A is n × n and invertible, we may solve this system by multiplication on the left by $A^{-1}$ to give
\[ \underbrace{A^{-1}A}_{I}\, x = A^{-1}b \quad\Rightarrow\quad \text{unique solution } x = A^{-1}b. \]
This is another demonstration that (1) =⇒ (4) above:

If A is invertible, the system Ax = b has a unique solution for each b.

This is often inaccurately remembered as “n equations in n unknowns have a unique solution”. This statement is only true if A is invertible.

The method of multiplying by $A^{-1}$ only works if A is square and A is invertible. It is also more work to find $A^{-1}$ than to solve the system using Gaussian elimination. In general, use Gaussian elimination to solve systems.
14 Determinants
14.1 Definition: Determinant
Associated to each square matrix A is a number called the determinant of A and denoted |A| or det(A). It is defined as follows.

• If A is a 1 × 1 matrix, say A = (a) then |A| is defined to be a.

• If $A = (a_{ij})$ is 2 × 2 then we define
\[ |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}. \]
We already encountered this in 13.2.1.

• In general if $A = (a_{ij})$ is n × n, first set
\[ C_{ij} = (-1)^{i+j} \begin{vmatrix} a_{11} & \cdots & a_{1\,j-1} & a_{1\,j+1} & \cdots & a_{1n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{i-1\,1} & \cdots & a_{i-1\,j-1} & a_{i-1\,j+1} & \cdots & a_{i-1\,n} \\ a_{i+1\,1} & \cdots & a_{i+1\,j-1} & a_{i+1\,j+1} & \cdots & a_{i+1\,n} \\ \vdots & & \vdots & \vdots & & \vdots \\ a_{n1} & \cdots & a_{n\,j-1} & a_{n\,j+1} & \cdots & a_{nn} \end{vmatrix}, \]
called the cofactor of $a_{ij}$. The (n − 1) × (n − 1) determinant is obtained by omitting the ith row and jth column from A.

We then define
\[ |A| = a_{11}C_{11} + a_{12}C_{12} + \cdots + a_{1n}C_{1n}, \]
which gives a recursive definition of the determinant.

Observe that $(-1)^{i+j}$ gives the pattern
\[ \begin{matrix} + & - & + & - & \cdots \\ - & + & - & + & \cdots \\ + & - & + & - & \cdots \\ \vdots & & & & \end{matrix} \]
Thus for a 3 × 3 matrix A we have
\[ |A| = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} \]
where
\[ C_{11} = \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}, \quad C_{12} = -\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}, \quad C_{13} = \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}. \]
14.1.1 Example
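Find the cofactors $C_{13}$ and $C_{23}$ of the matrix $\begin{pmatrix} 3 & 5 & 7 \\ 1 & 0 & 2 \\ 0 & 3 & 0 \end{pmatrix}$.

Write $M_{ij}$ for the determinant obtained by omitting row i and column j, so that $C_{ij} = (-1)^{i+j}M_{ij}$. Then
\[ C_{13} = (-1)^{1+3}M_{13} = 1 \cdot \begin{vmatrix} 1 & 0 \\ 0 & 3 \end{vmatrix} = 3, \]
\[ C_{23} = (-1)^{2+3}M_{23} = -1 \cdot \begin{vmatrix} 3 & 5 \\ 0 & 3 \end{vmatrix} = -9. \]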
14.1.2 Example
Calculate the determinant $\begin{vmatrix} 3 & 5 & 7 \\ 1 & 0 & 2 \\ 0 & 3 & 0 \end{vmatrix}$.

\[
\begin{aligned}
\begin{vmatrix} 3 & 5 & 7 \\ 1 & 0 & 2 \\ 0 & 3 & 0 \end{vmatrix}
&= (1)\cdot(3)\cdot\begin{vmatrix} 0 & 2 \\ 3 & 0 \end{vmatrix} + (-1)\cdot(5)\cdot\begin{vmatrix} 1 & 2 \\ 0 & 0 \end{vmatrix} + (1)\cdot(7)\cdot\begin{vmatrix} 1 & 0 \\ 0 & 3 \end{vmatrix} \\
&= 3(0 - 6) - 5(0 - 0) + 7(3 - 0) \\
&= 3.
\end{aligned}
\]
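The recursive definition translates almost word for word into code. The sketch below is an illustration in Python (the function name `det` is an assumption, not course notation). It expands along the first row exactly as in the definition, so it is only sensible for small matrices.

```python
def det(A):
    """Determinant by cofactor expansion along the first row (the recursive definition)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # Omit row 0 and column j to get the smaller (n-1) x (n-1) matrix.
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        # With 0-based j the sign (-1)**j matches (-1)^(1+j) in 1-based indexing.
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[3, 5, 7], [1, 0, 2], [0, 3, 0]]))
```

Run on the matrix of example 14.1.2 it gives the same value as the hand calculation.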
14.2 Properties of Determinants
Property (1): $|A| = |A^T|$

Consider for example the 2 × 2 case. Then
\[ A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \quad\Rightarrow\quad A^T = \begin{pmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{pmatrix} \]
Thus $|A| = a_{11}a_{22} - a_{12}a_{21} = |A^T|$. It can be shown that this holds for any square matrix, not just in the 2 × 2 case.

This means that any result about the rows in a general determinant is also true about the columns (since the rows of $A^T$ are the columns of A). In particular, any statement about the effect of row operations on determinants is also true for column operations.

Warning: We used row operations to simplify systems of equations, because they do not change the solution. Column operations may change the solution of linear systems, so we should not use column operations on such systems.

Property (2): The determinant may be found by taking cofactors along any row (not just the first) or down any column. E.g. for a 3 × 3 matrix A
\[
\begin{aligned}
|A| &= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} && \text{(definition, expansion along 1st row)} \\
&= a_{21}C_{21} + a_{22}C_{22} + a_{23}C_{23} && \text{(expansion along 2nd row)} \\
&= a_{13}C_{13} + a_{23}C_{23} + a_{33}C_{33} && \text{(expansion down 3rd column), etc.}
\end{aligned}
\]
This is useful if one row or column contains a larger number of zeros.
14.2.1 Examples
(i) Calculate $\begin{vmatrix} 3 & 5 & 7 \\ 1 & 0 & 2 \\ 0 & 3 & 0 \end{vmatrix}$ by (a) expanding down the third column, (b) expanding along the third row. Which is more efficient?

\[
\begin{aligned}
\begin{vmatrix} 3 & 5 & 7 \\ 1 & 0 & 2 \\ 0 & 3 & 0 \end{vmatrix}
&= 7\begin{vmatrix} 1 & 0 \\ 0 & 3 \end{vmatrix} - 2\begin{vmatrix} 3 & 5 \\ 0 & 3 \end{vmatrix} = 21 - 18 = 3 && \text{(exp. down 3rd col.)} \\
&= -3\begin{vmatrix} 3 & 7 \\ 1 & 2 \end{vmatrix} = -3(6 - 7) = 3 && \text{(exp. along } r_3)
\end{aligned}
\]

(ii) Find $\begin{vmatrix} 3 & 6 & 9 \\ 0 & 0 & 0 \\ 2 & 4 & 5 \end{vmatrix}$.

\[ \begin{vmatrix} 3 & 6 & 9 \\ 0 & 0 & 0 \\ 2 & 4 & 5 \end{vmatrix} = 0 \cdot C_{21} + 0 \cdot C_{22} + 0 \cdot C_{23} = 0 \quad \text{(expanding along } r_2) \]

In general the determinant of any matrix with a row or column of zeros is equal to zero.
14.2.2 Effect of Elementary Row Operations on Determinants
The calculation of large determinants from the definition is very tedious. It is much easier to calculate the determinant after row reducing a matrix, because the reduced matrix will contain many zeros. However, performing elementary row operations changes the value of a determinant. Luckily the changes are simple:

Property (3): A common factor in all entries of a row (or column) can be taken out as a multiplier of the determinant.
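14.2.3 Example
\[
\begin{aligned}
\begin{vmatrix} 3 & 6 & 9 \\ 0 & 1 & 2 \\ 3 & 4 & 5 \end{vmatrix}
&= 3C_{11} + 6C_{12} + 9C_{13} && \text{(expanding along } r_1) \\
&= 3[C_{11} + 2C_{12} + 3C_{13}] \\
&= 3 \cdot \begin{vmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 3 & 4 & 5 \end{vmatrix} && \text{(check!)}
\end{aligned}
\]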
Property (4): If 2 different rows (or columns) of A are interchanged ($r_i \leftrightarrow r_j$ or $c_i \leftrightarrow c_j$), |A| changes sign.
14.2.4 Example
We have seen $\begin{vmatrix} 3 & 5 & 7 \\ 1 & 0 & 2 \\ 0 & 3 & 0 \end{vmatrix} = 3$. Interchanging columns 2 and 3 gives
\[ \begin{vmatrix} 3 & 7 & 5 \\ 1 & 2 & 0 \\ 0 & 0 & 3 \end{vmatrix} = 3\begin{vmatrix} 3 & 7 \\ 1 & 2 \end{vmatrix} \ \text{(expanding along } r_3) \ = -3. \]
Let A be a matrix, and let B be the matrix obtained by interchanging two rows or columns. Then det(A) = −det(B). If A has two equal rows (or columns) and these are interchanged then A = B so det(A) = −det(B) = −det(A) so det(A) = 0. Hence:

If A has two equal rows or columns then det(A) = 0.

Property (5): If a multiple of one row (or column) is added to another ($r_i \to r_i + \alpha r_j$ or $c_i \to c_i + \alpha c_j$), the determinant is unchanged.

The above 5 properties give a shortcut for evaluating determinants. We can use a sequence of elementary row operations (and/or column operations) to produce a matrix with a row (or column) with at most one non-zero entry. We then expand along this row (or column) to obtain a smaller determinant, and repeat.

Important: row (and column) operations can change the value of the determinant, so we must always keep track of which operations we perform. (Unlike doing Gaussian elimination.)
14.2.5 Example
Calculate the determinant $\begin{vmatrix} 2 & 5 & 6 & 1 \\ 1 & 2 & 3 & -2 \\ 3 & 2 & 4 & 4 \\ 0 & 2 & 4 & 6 \end{vmatrix}$.

\[
\begin{aligned}
\begin{vmatrix} 2 & 5 & 6 & 1 \\ 1 & 2 & 3 & -2 \\ 3 & 2 & 4 & 4 \\ 0 & 2 & 4 & 6 \end{vmatrix}
&= \begin{vmatrix} 0 & 1 & 0 & 5 \\ 1 & 2 & 3 & -2 \\ 0 & -4 & -5 & 10 \\ 0 & 2 & 4 & 6 \end{vmatrix} && (r_1 \to r_1 - 2r_2,\ r_3 \to r_3 - 3r_2) \\
&= (-1) \cdot \begin{vmatrix} 1 & 0 & 5 \\ -4 & -5 & 10 \\ 2 & 4 & 6 \end{vmatrix} && \text{(expanding down column 1)} \\
&= (-1) \cdot \begin{vmatrix} 1 & 0 & 0 \\ -4 & -5 & 30 \\ 2 & 4 & -4 \end{vmatrix} && (c_3 \to c_3 - 5c_1) \\
&= -\begin{vmatrix} -5 & 30 \\ 4 & -4 \end{vmatrix} && \text{(expanding along } r_1) \\
&= (-5) \cdot 4 \cdot \begin{vmatrix} -1 & 6 \\ 1 & -1 \end{vmatrix} = -20(1 - 6) = 100.
\end{aligned}
\]

Using the definition with cofactors, to calculate an n × n determinant, we have to find n cofactors, each a (n − 1) × (n − 1) determinant. For each of these we need to find (n − 1) cofactors of size (n − 2) × (n − 2), etc. The total number of calculations is about
\[ n \cdot (n - 1) \cdot (n - 2) \cdots 3 \cdot 2 = n! \]
For a 25 × 25 determinant, this is about $10^{25}$ operations. This would take thousands of years, even on a computer. Doing row reductions, only a few thousand calculations are required, which could be done in a fraction of a second on a computer.
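To make the comparison concrete, here is a sketch of the row-reduction approach in Python. It is an illustration only: the function name `det_by_row_reduction` is an assumption, and only row swaps and row additions are used, so the only bookkeeping needed is the sign. The final step uses the fact that the determinant of a triangular matrix is the product of its diagonal entries.

```python
from fractions import Fraction

def det_by_row_reduction(A):
    """Determinant via row operations, tracking how each operation changes its value."""
    n = len(A)
    M = [[Fraction(x) for x in row] for row in A]
    sign = 1
    for col in range(n):
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)               # no pivot in this column: determinant is 0
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            sign = -sign                     # Property (4): a row swap changes the sign
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            # Property (5): adding a multiple of one row to another changes nothing
            M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]                    # triangular matrix: product of the diagonal
    return result

print(det_by_row_reduction([[2, 5, 6, 1], [1, 2, 3, -2], [3, 2, 4, 4], [0, 2, 4, 6]]))
```

The amount of work grows roughly like n³ rather than n!, which is the point of the comparison above.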
14.2.6 Example
Evaluate $\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}$. This matrix is called Vandermonde’s matrix, and it occurs in applications such as signal processing, error-correcting codes, and polynomial interpolation.

\[
\begin{aligned}
\begin{vmatrix} 1 & 1 & 1 \\ a & b & c \\ a^2 & b^2 & c^2 \end{vmatrix}
&= \begin{vmatrix} 1 & 1 & 1 \\ 0 & b - a & c - a \\ 0 & b^2 - a^2 & c^2 - a^2 \end{vmatrix} && (r_2 \to r_2 - a r_1,\ r_3 \to r_3 - a^2 r_1) \\
&= \begin{vmatrix} b - a & c - a \\ b^2 - a^2 & c^2 - a^2 \end{vmatrix} && \text{(expand down first column)}
\end{aligned}
\]
Now we extract a common factor from each of columns 1 and 2 giving
\[
\begin{aligned}
&= (b - a)(c - a)\begin{vmatrix} 1 & 1 \\ b + a & c + a \end{vmatrix} \\
&= (b - a)(c - a)[c + a - (b + a)] \\
&= (b - a)(c - a)(c - b).
\end{aligned}
\]
14.2.7 Determinants of Triangular Matrices
Suppose
\[ A = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \ddots & \vdots \\ \vdots & & \ddots & 0 \\ a_{n1} & \cdots & \cdots & a_{nn} \end{pmatrix} \]
is lower triangular. Then by repeated expansion along $r_1$ we obtain
\[ |A| = a_{11}\begin{vmatrix} a_{22} & & 0 \\ \vdots & \ddots & \\ a_{n2} & \cdots & a_{nn} \end{vmatrix} = \ldots = a_{11}a_{22}\cdots a_{nn}, \]
which is the product of the diagonal entries. The same result holds for upper triangular and diagonal matrices.

In particular $|I| = 1$.
14.3 Connection with inverses
An important property of determinants is the following.
14.3.1 Fact (product of determinants)
Let A, B be n × n matrices. Then
\[ |AB| = |A| \cdot |B|. \]
14.3.2 Theorem: Invertible matrices
\[ A \text{ is invertible} \iff |A| \neq 0. \]
14.3.3 Proof

=⇒ If A is invertible, then
\[ I = AA^{-1} \quad\Rightarrow\quad 1 = |I| = |AA^{-1}| = |A| \cdot |A^{-1}|, \]
so $|A| \neq 0$.

⇐= Follows from 14.3.4 (see below).

Remark: It follows immediately from the proof that if A is invertible, then
\[ |A^{-1}| = \frac{1}{|A|}, \]
i.e., the inverse of the determinant is the determinant of the inverse.
14.3.4 A Theoretical Formula
There is an explicit formula for the inverse of A. If $C_{ij}$ is the cofactor of the i, j entry of A, let C be the matrix of cofactors $C = (C_{ij})$. Then
\[ A^{-1} = \frac{1}{|A|}C^T. \]
Here $C^T$ is called the adjoint matrix of A. This formula is useful for some theoretical problems, but is useless as a tool for calculations because it is so time consuming to calculate all the cofactors.

Never use this formula for calculating inverses!

14.3.5 Example

In the 2 × 2 case we have
\[ |A| = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21} \]
and cofactors
\[ C_{11} = a_{22}, \quad C_{12} = -a_{21}, \quad C_{21} = -a_{12}, \quad C_{22} = a_{11} \]
\[ \therefore\ |A| \neq 0 \ \Rightarrow\ A^{-1} = \frac{1}{|A|}\begin{pmatrix} C_{11} & C_{21} \\ C_{12} & C_{22} \end{pmatrix} = \frac{1}{|A|}\begin{pmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{pmatrix} \]
in agreement with a previous result, 13.2.1.
15 Vector Products in 3-Space
15.1 Definition: Cross product
For vectors v, w in $\mathbb{R}^3$, the cross (or vector) product of v and w is a vector v × w such that

(i) $\|v \times w\| = \|v\| \cdot \|w\| \cdot |\sin\theta|$, where θ is the angle between v and w

(ii) v × w is perpendicular to v and w, in the right-hand direction from v to w (right hand rule).

Figure 36: The vector v × w is perpendicular to both v and w in $\mathbb{R}^3$
15.1.1 Application: angular momentum
Vector products also arise naturally in physics and engineering, e.g. a particle of unit charge moving with velocity v in a magnetic field B experiences a force F = v × B. The angular momentum of a particle of mass m moving with velocity v is given by L = m(r × v) where r is the position vector of the particle. This is important for central force problems.
15.1.2 Notes on the cross product
(1) For v, w ≠ 0
\[ v \times w = 0 \iff \sin\theta = 0 \iff \theta = 0 \text{ or } 180^\circ \]
∴ v × w = 0 ⇐⇒ v and w are parallel

(2) v × w = −w × v since they have the same magnitude but opposite direction

(3) v × v = 0, since sin 0 = 0.

(4) If v × w ≠ 0, then v × w is perpendicular to the plane determined by v and w. This is useful for obtaining the equation of a plane in $\mathbb{R}^3$ (see Math 1052).
15.1.3 Example
i× i = j × j = k × k = 0
i× j = k = −j × i
j × k = i = −k × j
k × i = j = −i× k.
Figure 37: This diagram is an aid in remembering the cross product between unit vectors.
15.1.4 Properties of cross product
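It can be shown that
\[
\begin{aligned}
v \times (w + u) &= v \times w + v \times u \\
(v + w) \times u &= v \times u + w \times u \\
v \times (\alpha w) &= (\alpha v) \times w = \alpha(v \times w), \quad \alpha \in \mathbb{R}.
\end{aligned}
\]
We have seen that
\[ v \times w = -w \times v \]
so the commutative law fails. Also we do not have the associative law: i.e. in general
\[ v \times (w \times u) \neq (v \times w) \times u \]
For example, (i × i) × k = 0 × k = 0, but i × (i × k) = i × (−j) = −k.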
Using the above results we can express the cross product of two vectors in terms of their components.
15.1.5 Theorem: Calculating the cross product
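Suppose $v = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}$, $w = \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix}$. Then
\[ v \times w = (v_2w_3 - v_3w_2)\mathbf{i} - (v_1w_3 - v_3w_1)\mathbf{j} + (v_1w_2 - v_2w_1)\mathbf{k} \]
i.e.
\[ v \times w = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}\mathbf{k}, \]
which can be evaluated by expanding the above determinant along the first row.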
15.1.6 Proof
This follows directly by expanding:
\[
\begin{aligned}
v \times w &= (v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}) \times (w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&= v_1\mathbf{i} \times (w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&\quad + v_2\mathbf{j} \times (w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&\quad + v_3\mathbf{k} \times (w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}) \\
&= v_1w_2(\mathbf{i} \times \mathbf{j}) - v_1w_3(\mathbf{k} \times \mathbf{i}) \\
&\quad - v_2w_1(\mathbf{i} \times \mathbf{j}) + v_2w_3(\mathbf{j} \times \mathbf{k}) \\
&\quad + v_3w_1(\mathbf{k} \times \mathbf{i}) - v_3w_2(\mathbf{j} \times \mathbf{k}) \\
&= (v_2w_3 - v_3w_2)\mathbf{i} - (v_1w_3 - v_3w_1)\mathbf{j} + (v_1w_2 - v_2w_1)\mathbf{k}.
\end{aligned}
\]
15.1.7 Example
Let v = (1, 2, 3), w = (4, 5, 6). Find a (non-zero) vector orthogonal to both v and w.

The vector v × w will do:
\[
\begin{aligned}
v \times w &= \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 2 & 3 \\ 4 & 5 & 6 \end{vmatrix} \\
&= \mathbf{i}\begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} - \mathbf{j}\begin{vmatrix} 1 & 3 \\ 4 & 6 \end{vmatrix} + \mathbf{k}\begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} \\
&= \mathbf{i}(12 - 15) - \mathbf{j}(6 - 12) + \mathbf{k}(5 - 8) \\
&= -3\mathbf{i} + 6\mathbf{j} - 3\mathbf{k}.
\end{aligned}
\]
This is orthogonal to both v and w.
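The component formula of Theorem 15.1.5 is easy to check by machine. The following Python sketch is illustrative only (the helper names `cross` and `dot` are assumptions). It computes v × w for the example and confirms orthogonality by taking dot products.

```python
def cross(v, w):
    """Cross product of two 3-vectors given as (v1, v2, v3) and (w1, w2, w3)."""
    v1, v2, v3 = v
    w1, w2, w3 = w
    return (v2 * w3 - v3 * w2,
            -(v1 * w3 - v3 * w1),
            v1 * w2 - v2 * w1)

def dot(v, w):
    """Dot product of two vectors of the same length."""
    return sum(a * b for a, b in zip(v, w))

v, w = (1, 2, 3), (4, 5, 6)
n = cross(v, w)
print(n, dot(n, v), dot(n, w))   # both dot products should be 0
```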
15.2 Application: area of a triangle
Find the area of the triangle ABC with vectors v, w along edges as shown in Figure 38.

Figure 38: Find the area of the triangle ABC

\[ \text{Area} = \frac{1}{2}\,\text{base} \cdot \text{perpendicular height} = \frac{1}{2}\|v\| \cdot \|w\| \sin\theta = \frac{1}{2}\|v \times w\|. \]
15.2.1 Example
Find the area of a triangle with vertices A = (0, 1, 3), B = (1, 2, 1), C = (4, 1, 0) and obtain a vector perpendicular to the plane of ∆ABC.

\[ w = \overrightarrow{AB} = \begin{pmatrix} 1 \\ 1 \\ -2 \end{pmatrix}, \quad v = \overrightarrow{AC} = \begin{pmatrix} 4 \\ 0 \\ -3 \end{pmatrix} \]
Hence a vector perpendicular to the plane of ∆ABC is
\[ v \times w = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 4 & 0 & -3 \\ 1 & 1 & -2 \end{vmatrix} = 3\mathbf{i} + 5\mathbf{j} + 4\mathbf{k}. \]
Area of the triangle:
\[ \frac{1}{2}\|v \times w\| = \frac{1}{2}\|3\mathbf{i} + 5\mathbf{j} + 4\mathbf{k}\| = \frac{1}{2}\sqrt{9 + 25 + 16} = \frac{1}{2}\sqrt{50} = \frac{5\sqrt{2}}{2}. \]
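The same calculation can be written as a short routine. This Python sketch is an illustration (the function name `triangle_area` is an assumption). It forms the edge vectors from the vertices, takes their cross product, and halves its length.

```python
import math

def triangle_area(A, B, C):
    """Return (area, normal) for the triangle with vertices A, B, C in R^3."""
    w = tuple(b - a for a, b in zip(A, B))   # w = AB
    v = tuple(c - a for a, c in zip(A, C))   # v = AC
    n = (v[1] * w[2] - v[2] * w[1],
         v[2] * w[0] - v[0] * w[2],
         v[0] * w[1] - v[1] * w[0])          # n = v x w, normal to the plane of ABC
    return 0.5 * math.sqrt(sum(x * x for x in n)), n

area, normal = triangle_area((0, 1, 3), (1, 2, 1), (4, 1, 0))
print(area, normal)
```

The area is returned as a floating-point number, so it should be compared with 5√2/2 up to rounding rather than exactly.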
15.3 Scalar Triple Product
The number u · (v ×w) is called the scalar triple product of u, v, w.
Using the result of the previous theorem, express this product in terms of components.

Set $u = u_1\mathbf{i} + u_2\mathbf{j} + u_3\mathbf{k}$. Then
\[
\begin{aligned}
u \cdot (v \times w) &= (u_1\mathbf{i} + u_2\mathbf{j} + u_3\mathbf{k}) \cdot \left( \begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix}\mathbf{k} \right) \\
&= u_1\begin{vmatrix} v_2 & v_3 \\ w_2 & w_3 \end{vmatrix} - u_2\begin{vmatrix} v_1 & v_3 \\ w_1 & w_3 \end{vmatrix} + u_3\begin{vmatrix} v_1 & v_2 \\ w_1 & w_2 \end{vmatrix} \\
&= \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix}.
\end{aligned}
\]
This is a 3 × 3 determinant.
15.4 Geometrical Interpretation
The volume V of a parallelepiped determined by vectors u, v, w along edges as shown is given by
\[ V = |u \cdot (v \times w)| \]
Figure 39: Parallelepiped determined using vectors.
15.4.1 Proof
V = (area of base ABCD) × h where
\[
\begin{aligned}
h &= \text{perpendicular height} \\
&= \text{size of component of } u \text{ in direction of } v \times w \\
&= \left| \frac{u \cdot (v \times w)}{\|v \times w\|^2}(v \times w) \right| = \frac{|u \cdot (v \times w)|}{\|v \times w\|}.
\end{aligned}
\]
From previous example,
\[ \text{area of the base } ABCD = 2 \times (\text{area } \Delta ABD) = \|v \times w\| \]
\[ \Rightarrow V = \|v \times w\| \cdot h = |u \cdot (v \times w)|. \]
15.4.2 Example
Find the volume V of the parallelepiped determined by the vectors
\[ a = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}, \quad b = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}, \quad c = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}. \]
V = |a · (b × c)|, where
\[ a \cdot (b \times c) = \begin{vmatrix} 1 & 2 & 1 \\ 1 & 1 & 1 \\ 0 & 0 & 1 \end{vmatrix} = 1 \cdot \begin{vmatrix} 1 & 2 \\ 1 & 1 \end{vmatrix} = 1 - 2 = -1 \quad \text{(expanded along } r_3) \]
\[ \Rightarrow V = |-1| = 1. \]
The volume V of a parallelepiped determined by the vectors u, v, w vanishes if and only if the vectors are co-planar; therefore the vectors are co-planar if and only if
\[ \begin{vmatrix} u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \\ w_1 & w_2 & w_3 \end{vmatrix} = 0. \]
This gives a useful method for determining when 3 vectors are co-planar.
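The volume formula and the co-planarity test can both be sketched with one helper. The Python below is an illustration only: the function names are assumptions, and the vectors (1, 2, 3), (4, 5, 6), (7, 8, 9) in the last line are a made-up test case, not taken from the workbook.

```python
def scalar_triple(u, v, w):
    """u . (v x w), computed as the 3x3 determinant with rows u, v, w."""
    return (u[0] * (v[1] * w[2] - v[2] * w[1])
            - u[1] * (v[0] * w[2] - v[2] * w[0])
            + u[2] * (v[0] * w[1] - v[1] * w[0]))

def volume(u, v, w):
    """Volume of the parallelepiped with edges u, v, w."""
    return abs(scalar_triple(u, v, w))

def coplanar(u, v, w):
    """Exact test, intended for integer or rational components."""
    return scalar_triple(u, v, w) == 0

print(volume((1, 2, 1), (1, 1, 1), (0, 0, 1)))        # the example above
print(coplanar((1, 2, 3), (4, 5, 6), (7, 8, 9)))      # hypothetical test vectors
```

With floating-point components the equality test in `coplanar` would need a tolerance, which is why the sketch restricts itself to exact input.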
16 Eigenvalues and eigenvectors
16.1 Ranking Webpages
Web search engines assign a number to each webpage, called the pagerank. The higher the pagerank, the higher that webpage is listed in websearches.

If a webpage links to 3 other pages, it passes one third of its rank to each of those pages. The pagerank of a webpage is calculated by summing all the contributions from in-linking pages.
16.1.1 Example
Four webpages link to each other as follows:

Page 1 links to pages 2 and 3.
Page 2 links to page 1.
Page 3 links to pages 1, 2 and 4.
Page 4 links to pages 1 and 2.
Let the ranks of the pages be r1, r2, r3, r4.
The only page linking in to 4 is page 3. Page 3 links to 3 pages (1, 2, 4), so it passes 1/3 of its rank on to each. So
\[ r_4 = \frac{1}{3}r_3. \]
The rank of page 3 is given by:

The only page linking in to 3 is page 1. Page 1 links to 2 pages (2, 3), so it passes 1/2 of its rank on to each. So
\[ r_3 = \frac{1}{2}r_1. \]
The pages linking in to page 2 are 1, 3 and 4. So
\[ r_2 = \frac{1}{2}r_1 + \frac{1}{3}r_3 + \frac{1}{2}r_4. \]
$r_1$ gets weight 1/2 because page 1 links to 2 pages. $r_3$ gets weight 1/3 because page 3 links to 3 pages. $r_4$ gets weight 1/2 because page 4 links to 2 pages.

Altogether we get the following system of equations:
r1 = r2 + 13r3 + 1
2r4
r2 = 12r1 + 1
3r3 + 12r4
r3 = 12r1
r4 = 13r3
r1 = r2 + 13r3 + 1
2r4
r2 = 12r1 + 1
3r3 + 1
2r4
r3 = 12r1
r4 = 13r3
This can be written as Ar = r where
\[
A = \begin{pmatrix}
0 & 1 & \tfrac{1}{3} & \tfrac{1}{2} \\
\tfrac{1}{2} & 0 & \tfrac{1}{3} & \tfrac{1}{2} \\
\tfrac{1}{2} & 0 & 0 & 0 \\
0 & 0 & \tfrac{1}{3} & 0
\end{pmatrix}.
\]
Solving this system (exercise) shows that r1 = 12, r2 = 9, r3 = 6, r4 = 2 is a solution, so page 1 is ranked highest, and page 4 lowest.
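The claimed solution can be checked by direct multiplication. The following is a minimal Python sketch, not part of the course material; the helper name `mat_vec` is an assumption chosen for illustration, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction as F

# Link matrix A from Example 16.1.1: entry A[i][j] is the share of the
# rank of page j+1 that is passed on to page i+1.
A = [
    [F(0),    F(1), F(1, 3), F(1, 2)],
    [F(1, 2), F(0), F(1, 3), F(1, 2)],
    [F(1, 2), F(0), F(0),    F(0)],
    [F(0),    F(0), F(1, 3), F(0)],
]
r = [F(12), F(9), F(6), F(2)]


def mat_vec(M, x):
    """Multiply the matrix M (a list of rows) by the vector x."""
    return [sum(m_ij * x_j for m_ij, x_j in zip(row, x)) for row in M]


Ar = mat_vec(A, r)
print(Ar == r)  # True
```

Any scalar multiple of r passes the same check, which is why only the relative sizes of the ranks matter.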
In general, the pagerank of v is calculated by adding the weight of all the pages w1, . . . , wn that link in to v:
\[
r(v) = \sum_{j=1}^{n} \frac{r(w_j)}{n_j},
\]
where the sum is over all pages w1, . . . , wn that link to v, and page wj links to nj pages.
Problem: to find r(v) we need to know r(w1), . . . , r(wn). How do we ever get started?
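Idea: given a set of N webpages w1, . . . , wN, create the N × N matrix A = (aij), where
\[
a_{ij} = \begin{cases} \dfrac{1}{n_j}, & \text{if page } w_j \text{ links to } w_i \\[4pt] 0, & \text{otherwise.} \end{cases}
\]
Let
\[
\mathbf{r} = \begin{pmatrix} r_1 \\ r_2 \\ \vdots \\ r_N \end{pmatrix}.
\]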
Then the pagerank equation is
\[ A\mathbf{r} = \mathbf{r}. \tag{16.1} \]
This is an N × N system, but not one that we have seen before because the RHS is not constant. The unknown vector r appears on both sides of the equation.
Equation (16.1) is called an eigenvector equation. The vector r is called an eigenvector of the matrix A.
Instead of the equation Ax = x, we often encounter equations like Ax = 2x. This is called an eigenvector problem with eigenvalue 2. Equation (16.1) is a special case with eigenvalue 1.
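The rule for the entries aij can be turned into a short routine. This is a sketch under stated assumptions: the dictionary `links` is a hypothetical encoding of the four-page example (each page mapped to the pages it links to), and `link_matrix` is an illustrative name, not a function from the course.

```python
from fractions import Fraction

# Hypothetical encoding of Example 16.1.1: links[j] lists the pages
# that page j links to.
links = {1: [2, 3], 2: [1], 3: [1, 2, 4], 4: [1, 2]}


def link_matrix(links):
    """Build A = (a_ij) with a_ij = 1/n_j if page j links to page i, else 0."""
    pages = sorted(links)
    index = {page: k for k, page in enumerate(pages)}
    n = len(pages)
    A = [[Fraction(0)] * n for _ in range(n)]
    for j, targets in links.items():
        for i in targets:
            A[index[i]][index[j]] = Fraction(1, len(targets))
    return A


A = link_matrix(links)
for row in A:
    print([str(entry) for entry in row])

# Each page passes on all of its rank, so every column sums to 1.
column_sums = [sum(A[i][j] for i in range(4)) for j in range(4)]
print(column_sums == [1, 1, 1, 1])  # True
```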
16.2 Geometry of eigenvectors
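Consider the matrix $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$. This acts as a linear transformation, mapping
\[
\mathbf{i} \mapsto \begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \mathbf{j} \mapsto \begin{pmatrix} 1 \\ 2 \end{pmatrix}.
\]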
Draw a vector that is unmoved under the action of A.
The vector $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$ is unchanged.
Check:
\[
A \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}.
\]
For most vectors $A\mathbf{v} \neq \mathbf{v}$, but $\begin{pmatrix} -1 \\ 1 \end{pmatrix}$ is special for the matrix A. It is called an eigenvector of A.
Instead of Ax = x we often encounter equations like Ax = 2x. Then 2 is called an eigenvalue, and x a corresponding eigenvector. Above we looked at the special case of eigenvalue 1.
16.2.1 Other applications
Eigenvalues and eigenvectors have many other applications, e.g. in Quantum Mechanics the energy of a system is represented by a matrix whose eigenvalues give the allowed energies. This is why discrete energy levels are observed. Another situation where they are used is in modelling the population dynamics of a predator-prey system in biology. The eigenvalues and eigenvectors determine the stability and dynamics of a population in such models.
There are many other applications in Mathematics, Physics, Chemistry and Biology.
16.3 Eigenvalues & Vectors
Let A be an n× n matrix and consider the equation
Ax = λx (16.2)
with λ a number (possibly complex). We are given the matrix A, and need to find λ and x.
Obviously x = 0 is always a solution to (16.2) (for any λ). A value λ for which (16.2) has a solution x ≠ 0 is called an eigenvalue of A. The corresponding solution vector x ≠ 0 is called an eigenvector of A corresponding to the eigenvalue λ.
Note that λ is allowed to be 0, but x is not allowed to be 0.
16.3.1 Example
Let $A = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix}$, $\lambda = -2$ and $\mathbf{x} = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. Show that λ is an eigenvalue for A, with corresponding eigenvector x.
We have to check that Ax = λx.
\[
A\mathbf{x} = \begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ -2 \end{pmatrix} = -2 \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \lambda \mathbf{x}.
\]
Check that another eigenvalue is λ = 5 with corresponding eigenvector $\begin{pmatrix} 3 \\ 4 \end{pmatrix}$.
16.3.2 Note
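Usually Ax is related to x in a much more complicated way, e.g.
\[
\begin{pmatrix} 1 & 3 \\ 4 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = \begin{pmatrix} 7 \\ 8 \end{pmatrix} \neq \lambda \begin{pmatrix} 1 \\ 2 \end{pmatrix}.
\]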
16.4 How to find eigenvalues
We have
\[
A\mathbf{x} = \lambda \mathbf{x} \iff A\mathbf{x} = \lambda (I\mathbf{x}) \iff A\mathbf{x} - \lambda I \mathbf{x} = \mathbf{0} \iff (A - \lambda I)\mathbf{x} = \mathbf{0}.
\]
We seek a non-zero solution x of this equation.
Recall that if B is an n × n matrix and det B ≠ 0 then Bx = 0 has a unique solution, which must be x = 0 (§ 13.4 and 14.3.2). Since we want a non-zero solution x above, we must have det(A − λI) = 0.
Thus A has eigenvalue λ only if
|A− λI| = 0. (16.3)
We will explore eigenvalues and eigenvectors in a MATLAB module.
16.4.1 Definition: Characteristic polynomial, characteristic equation
The determinant
\[ p(\lambda) = |A - \lambda I| \]
is a polynomial of degree n in λ called the characteristic polynomial of A. We call equation (16.3) the characteristic equation.
Thus the eigenvalues of A are given by the roots of the characteristic equation (16.3). This means that A has at least 1 and at most n distinct eigenvalues (possibly complex). The corresponding eigenvectors may then be obtained for each λ in turn by solving
(A− λI)x = 0
using Gaussian elimination.
16.4.2 Example
Find the eigenvalues and eigenvectors of
\[
A = \begin{pmatrix} -5 & 2 \\ 2 & -2 \end{pmatrix}.
\]
We have
\[
A - \lambda I = \begin{pmatrix} -5-\lambda & 2 \\ 2 & -2-\lambda \end{pmatrix}.
\]
So
\[
p(\lambda) = |A - \lambda I| = \begin{vmatrix} -5-\lambda & 2 \\ 2 & -2-\lambda \end{vmatrix} = (\lambda+5)(\lambda+2) - 4 = \lambda^2 + 7\lambda + 6 = (\lambda+6)(\lambda+1).
\]
Solving p(λ) = 0 yields the eigenvalues $\lambda_1 = -1$, $\lambda_2 = -6$.
To get the corresponding eigenvectors, solve $(A - \lambda_i I)\mathbf{x} = \mathbf{0}$ for i = 1, 2.
$\lambda_1 = -1$:
\[
\left(\begin{array}{cc|c} -4 & 2 & 0 \\ 2 & -1 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & -\tfrac{1}{2} & 0 \\ 2 & -1 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & -\tfrac{1}{2} & 0 \\ 0 & 0 & 0 \end{array}\right)
\]
with general solution
\[
\mathbf{x} = \begin{pmatrix} \alpha \\ 2\alpha \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix},
\]
so $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue λ = −1.
$\lambda_2 = -6$:
\[
\left(\begin{array}{cc|c} 1 & 2 & 0 \\ 2 & 4 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & 2 & 0 \\ 0 & 0 & 0 \end{array}\right)
\]
with general solution
\[
\mathbf{x} = \begin{pmatrix} -2\alpha \\ \alpha \end{pmatrix} = \alpha \begin{pmatrix} -2 \\ 1 \end{pmatrix}.
\]
Hence $\begin{pmatrix} -2 \\ 1 \end{pmatrix}$ is an eigenvector corresponding to the eigenvalue λ = −6.
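For a 2 × 2 matrix with rows (a, b) and (c, d), expanding |A − λI| gives the quadratic λ² − (a + d)λ + (ad − bc), so the eigenvalues come from the quadratic formula. The sketch below illustrates this in Python; the function names `eigenvalues_2x2` and `is_eigenpair` are assumptions made for illustration, and `cmath` is used so that complex roots are handled as well.

```python
import cmath


def eigenvalues_2x2(a, b, c, d):
    """Roots of det(A - lambda*I) = lambda^2 - (a + d)*lambda + (a*d - b*c)."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def is_eigenpair(A, lam, x, tol=1e-12):
    """Check A x = lam x component by component, up to a small tolerance."""
    (a, b), (c, d) = A
    Ax = (a * x[0] + b * x[1], c * x[0] + d * x[1])
    return all(abs(Ax[k] - lam * x[k]) < tol for k in range(2))


A = ((-5, 2), (2, -2))
lam1, lam2 = eigenvalues_2x2(-5, 2, 2, -2)
print(lam1, lam2)                     # (-1+0j) (-6+0j)
print(is_eigenpair(A, -1, (1, 2)))    # True
print(is_eigenpair(A, -6, (-2, 1)))   # True
```

This shortcut is specific to 2 × 2 matrices; for larger matrices the characteristic polynomial has higher degree and the eigenvectors are still found by Gaussian elimination.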
16.4.3 Example
Find the eigenvalues and eigenvectors of $A = \begin{pmatrix} 0 & 2 \\ -2 & 0 \end{pmatrix}$.
\[
p(\lambda) = |A - \lambda I| = \begin{vmatrix} -\lambda & 2 \\ -2 & -\lambda \end{vmatrix} = \lambda^2 + 4.
\]
Eigenvalues: $p(\lambda) = 0 \Rightarrow \lambda^2 + 4 = 0 \Rightarrow \lambda = \pm 2i \Rightarrow \lambda_1 = 2i;\ \lambda_2 = -2i$.
In this case A has two complex eigenvalues λ = ±2i. To obtain eigenvectors solve (A − λI)x = 0, λ = ±2i, exactly as before, using complex numbers. The augmented matrix (A − λI | 0) corresponding to $\lambda_1 = 2i$ is
\[
\left(\begin{array}{cc|c} -2i & 2 & 0 \\ -2 & -2i & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} -2i & 2 & 0 \\ 0 & 0 & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & i & 0 \\ 0 & 0 & 0 \end{array}\right)
\]
\[
\Rightarrow x_1 + i x_2 = 0 \Rightarrow x_1 = -i x_2 = -i\alpha, \quad x_2 = \alpha
\]
\[
\Rightarrow \mathbf{x} = \begin{pmatrix} -i\alpha \\ \alpha \end{pmatrix} = \alpha \begin{pmatrix} -i \\ 1 \end{pmatrix} \text{ is the general solution,}
\]
and $\begin{pmatrix} -i \\ 1 \end{pmatrix}$ is an eigenvector corresponding to $\lambda_1 = 2i$.
For λ = −2i:
\[
\left(\begin{array}{cc|c} 2i & 2 & 0 \\ -2 & 2i & 0 \end{array}\right) \sim \left(\begin{array}{cc|c} 1 & -i & 0 \\ 0 & 0 & 0 \end{array}\right) \Rightarrow x_1 - i x_2 = 0 \Rightarrow x_1 = i x_2 = i\alpha, \quad x_2 = \alpha
\]
\[
\Rightarrow \mathbf{x} = \begin{pmatrix} i\alpha \\ \alpha \end{pmatrix} = \alpha \begin{pmatrix} i \\ 1 \end{pmatrix} \text{ is the general solution,}
\]
and $\begin{pmatrix} i \\ 1 \end{pmatrix}$ is an eigenvector for λ = −2i.
16.4.4 Check
Solving the system (A − λI)x = 0 to find eigenvectors always leads to at least one row of zeros when A − λI is row reduced. This is because λ is chosen so that (A − λI)x = 0 has a solution x ≠ 0.
So A − λI is not invertible and so A − λI does not reduce to I.
17 Vector Spaces
In this chapter we consider sets of vectors with special properties, called vector spaces.
Recall that if A and B are sets then A ⊆ B means A is a subset of B. The set with no elements, called the empty set, is denoted ∅. This is not the same as the set {0} whose single element is the zero vector.
17.1 Linear combinations
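In $\mathbb{R}^2$, every vector $\begin{pmatrix} x \\ y \end{pmatrix}$ can be written using the vectors i and j: $\begin{pmatrix} x \\ y \end{pmatrix} = x\,\mathbf{i} + y\,\mathbf{j}$. In $\mathbb{R}^3$ every vector can be written using i, j, k. We generalize this idea.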
17.1.1 Definition: Linear combination
If v1,v2, ...,vm ∈ Rn, a vector of the form
v = α1v1 + α2v2 + · · ·+ αmvm , αi ∈ R
is called a linear combination of the vectors v1,v2, ...,vm.
17.1.2 Example
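Show that each of the vectors
\[
\mathbf{w}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \mathbf{w}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}
\]
is a linear combination of
\[
\mathbf{v}_1 = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
Write
\[
\mathbf{w}_1 = \alpha_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \alpha_2 \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
Equating components:
\[
\alpha_1 + \alpha_2 = 1, \qquad \alpha_1 - \alpha_2 = 0.
\]
Solving, $\alpha_1 = \alpha_2 = 1/2$. So
\[
\mathbf{w}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{1}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{2}\mathbf{v}_1 + \frac{1}{2}\mathbf{v}_2.
\]
Similarly we get
\[
\mathbf{w}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{2} \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{2}\mathbf{v}_1 - \frac{1}{2}\mathbf{v}_2.
\]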
17.1.3 Example
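Every $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix} \in \mathbb{R}^n$ is a linear combination of the coordinate vectors $\mathbf{e}_i$:
\[
\mathbf{v} = v_1 \mathbf{e}_1 + v_2 \mathbf{e}_2 + \cdots + v_n \mathbf{e}_n. \tag{17.1}
\]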
17.2 Linear independence
17.2.1 Definition: Linear independence
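Consider the linear combination
\[ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \cdots + \alpha_m \mathbf{v}_m \]
with $\alpha_1 = \alpha_2 = \cdots = \alpha_m = 0$. Obviously this gives the 0 vector.
A set of vectors $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\} \subseteq \mathbb{R}^n$ is linearly dependent if there exist scalars $\alpha_1, \ldots, \alpha_m$, not all zero, such that
\[ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \cdots + \alpha_m \mathbf{v}_m = \mathbf{0}. \]
S is called linearly independent if no such scalars exist, i.e. if the only linear combination adding up to 0 is with all the scalars 0. So S is linearly independent if
\[ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \cdots + \alpha_m \mathbf{v}_m = \mathbf{0} \ \Rightarrow\ \alpha_1 = \alpha_2 = \cdots = \alpha_m = 0. \]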
17.3 How to test for linear independence
Given vectors v1, . . . ,vm, solve the vector equation
α1v1 + · · ·+ αmvm = 0.
If the only solution is α1 = · · · = αm = 0 then the vectors are linearly independent.
If there is any other solution, the vectors are linearly dependent.
17.3.1 Example
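Show that in $\mathbb{R}^2$, the vectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ are linearly independent.
\[
\alpha \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \Rightarrow \begin{pmatrix} \alpha + \beta \\ \alpha - \beta \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \Rightarrow \left. \begin{aligned} \alpha + \beta &= 0 \\ \alpha - \beta &= 0 \end{aligned} \right\} \Rightarrow \alpha = \beta = 0.
\]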
17.3.2 Example
Show that in Rn the coordinate vectors ei are linearly independent.
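\[
\alpha_1 \mathbf{e}_1 + \alpha_2 \mathbf{e}_2 + \cdots + \alpha_n \mathbf{e}_n = \mathbf{0}
\]
\[
\Rightarrow \begin{pmatrix} \alpha_1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ \alpha_2 \\ \vdots \\ 0 \end{pmatrix} + \cdots + \begin{pmatrix} 0 \\ \vdots \\ 0 \\ \alpha_n \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}
\]
\[
\Rightarrow \alpha_1 = 0,\ \alpha_2 = 0,\ \ldots,\ \alpha_n = 0.
\]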
17.3.3 Example
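Show that in $\mathbb{R}^3$, the vectors $\begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}$, $\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$ are linearly dependent.
\[
\alpha_1 \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix} + \alpha_2 \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} + \alpha_3 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}
\]
Writing $x_i = \alpha_i$ for the unknowns and row reducing the augmented matrix of this system:
\[
\left(\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 3 & 1 & 1 & 0 \\ 1 & -1 & 1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & -2 & 1 & 0 \\ 0 & -2 & 1 & 0 \end{array}\right) \sim \left(\begin{array}{ccc|c} 1 & 1 & 0 & 0 \\ 0 & 1 & -\tfrac{1}{2} & 0 \\ 0 & 0 & 0 & 0 \end{array}\right).
\]
$x_3$ is not determined. Let $x_3 = \alpha$. Then $x_2 = \tfrac{1}{2}\alpha$, $x_1 = -\tfrac{1}{2}\alpha$, so $\mathbf{x} = \alpha \begin{pmatrix} -\tfrac{1}{2} \\ \tfrac{1}{2} \\ 1 \end{pmatrix}$ for any α. If we take α = −2, $\mathbf{x} = \begin{pmatrix} 1 \\ -1 \\ -2 \end{pmatrix}$ is a solution.
Check:
\[
\begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix} - \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} - 2 \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.
\]
We have found a non-trivial linear combination summing to 0, so the vectors are linearly dependent.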
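The final check in this example is easy to reproduce mechanically. The following Python sketch is illustrative only; `linear_combination` is a hypothetical helper name, and integer coefficients are used so the arithmetic is exact.

```python
def linear_combination(coeffs, vectors):
    """Return sum of coeffs[k] * vectors[k], computed component by component."""
    n = len(vectors[0])
    return tuple(
        sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(n)
    )


vectors = [(1, 3, 1), (1, 1, -1), (0, 1, 1)]

# Coefficients found in the example by taking alpha = -2.
print(linear_combination((1, -1, -2), vectors))   # (0, 0, 0)

# Another member of the solution family (-alpha/2, alpha/2, alpha), alpha = 4.
print(linear_combination((-2, 2, 4), vectors))    # (0, 0, 0)
```

One non-trivial combination equal to the zero vector is enough to establish linear dependence.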
17.3.4 Example
Show that any finite set of vectors containing the zero vector 0 is linearly dependent.
Write $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_m, \mathbf{0}\} \subseteq \mathbb{R}^n$. Then
\[ 0 \cdot \mathbf{v}_1 + \cdots + 0 \cdot \mathbf{v}_m + 1 \cdot \mathbf{0} = \mathbf{0} \]
⇒ S linearly dependent.
17.3.5 Special cases of linear dependence
(i) If two vectors v1, v2 are linearly dependent, then there exist scalars α1, α2 ∈ R, not both 0, with
\[ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 = \mathbf{0}. \]
If $\alpha_1 \neq 0$ then $\mathbf{v}_1 = -\dfrac{\alpha_2}{\alpha_1} \mathbf{v}_2$, so v1 is a scalar multiple of v2. If $\alpha_2 \neq 0$ then $\mathbf{v}_2 = -\dfrac{\alpha_1}{\alpha_2} \mathbf{v}_1$, so v2 is a scalar multiple of v1.
So two vectors are linearly dependent if and only if one of them is a multiple of the other.
(ii) If one vector v is linearly dependent then αv = 0 with α ≠ 0, so v = 0. Thus a single vector is linearly dependent if and only if it is the zero vector.
17.4 Invertible matrices and linear independence
There is a relation between the nullspace of A and linear independence of the columns of A.
Suppose b ∈ NS(A). Let $\mathbf{b} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}$. That is, $\mathbf{b} = b_1 \mathbf{e}_1 + b_2 \mathbf{e}_2 + \cdots + b_n \mathbf{e}_n$.
Since b ∈ NS(A), Ab = 0 so
\[
\mathbf{0} = A\mathbf{b} = A\left(b_1 \mathbf{e}_1 + b_2 \mathbf{e}_2 + \cdots + b_n \mathbf{e}_n\right) = b_1 (A\mathbf{e}_1) + b_2 (A\mathbf{e}_2) + \cdots + b_n (A\mathbf{e}_n).
\]
We saw earlier that $A\mathbf{e}_j$ is the jth column vector of A. So the right hand side is a linear combination among the columns of A.
Thus: if there is a vector b ∈ NS(A) with b ≠ 0 then there is a non-trivial linear combination of columns of A adding up to 0, so the columns of A are linearly dependent.
Conversely, if the columns of A are dependent there is a non-trivial linear combination of columns of A adding up to 0, say $b_1 (A\mathbf{e}_1) + b_2 (A\mathbf{e}_2) + \cdots + b_n (A\mathbf{e}_n) = \mathbf{0}$. This gives a non-zero vector
\[
\mathbf{b} = b_1 \mathbf{e}_1 + b_2 \mathbf{e}_2 + \cdots + b_n \mathbf{e}_n = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{pmatrix}
\]
in NS(A).
Thus: NS(A) = {0} if and only if the columns of A are linearly independent. But recall (§13.4) that A is invertible exactly when NS(A) = {0}.
We have proved:
17.4.1 Theorem (existence of inverse)
Let A be a square matrix. Then A is invertible if and only if the columns of A are linearly independent.
Notice this means swapping the columns of a matrix does not affect whether or not it is invertible.
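Furthermore, A is invertible if and only if the rows of A are linearly independent:
\[
\begin{aligned}
A \text{ is invertible} &\iff A^T \text{ is invertible} \\
&\iff \text{columns of } A^T \text{ are linearly independent} \\
&\iff \text{rows of } A \text{ are linearly independent.}
\end{aligned}
\]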
17.4.2 Example
From example 17.3.3, the columns of
\[
A = \begin{pmatrix} 1 & 1 & 0 \\ 3 & 1 & 1 \\ 1 & -1 & 1 \end{pmatrix}
\]
are linearly dependent. Therefore A is not invertible.
17.4.3 Example
Let
\[
A = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.
\]
It may be shown that the columns of A are linearly independent (left as an exercise, or see below). Therefore A is invertible.
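Since a square matrix is invertible exactly when its determinant is non-zero (Theorem 14.3.2), the two examples above can be cross-checked with a 3 × 3 determinant. This is a minimal Python sketch; `det3` is an illustrative helper, not a routine from the workbook.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


A_dependent = [(1, 1, 0), (3, 1, 1), (1, -1, 1)]    # matrix of Example 17.4.2
A_triangular = [(1, 1, 1), (0, 1, 1), (0, 0, 1)]    # matrix of Example 17.4.3

print(det3(A_dependent))    # 0, so the columns are dependent
print(det3(A_triangular))   # 1, so the columns are independent
```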
17.4.4 Summary of conditions equivalent to invertibility
Let A be an n × n matrix. Give a summary of all the conditions we have found that are equivalent to A being invertible.
1. A is invertible.
2. There exists an n × n matrix X with AX = XA = I [Definition of invertible, § 13.2].
3. $A^T$ is invertible [§ 13.2.5(3)].
4. The columns of A are linearly independent [§ 17.4.1].
5. The rows of A are linearly independent [§ 17.4.1].
6. Ax = b has a unique solution x for any $\mathbf{b} \in \mathbb{R}^n$ [§ 13.4].
7. NS(A) = {0} [§ 13.4].
8. The Jordan reduced form of A is I [§ 12.4].
9. det(A) ≠ 0 [Theorem 14.3.2].
10. rank A = n.
17.5 Vector spaces
Consider the following sets of vectors:
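• $V = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \,\middle|\, x + y = 2 \right\}$. The set V is a set of (infinitely many) vectors in $\mathbb{R}^2$.
• The set containing just the zero vector {0}. This is a set with just one vector.
• $W = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \,\middle|\, x^2 + y^2 = 0 \right\}$. Since $x^2, y^2 \geq 0$, the only vector in W is 0, so W = {0}.
• Another special set is the empty set, not containing any vectors at all(!) e.g. $Y = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \,\middle|\, x^2 + y^2 = -1 \right\}$. There are no vectors in Y.
• Let $V = \left\{ \begin{pmatrix} a \\ b \\ c \end{pmatrix} \,\middle|\, a + b = 2c \right\}$.
Is $\begin{pmatrix} 1 \\ 0 \\ 3 \end{pmatrix} \in V$? No: $1 + 0 \neq 2 \cdot 3$. But $\mathbf{v} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} 2 \\ 4 \\ 3 \end{pmatrix} \in V$: $1 + 1 = 2 \cdot 1$ and $2 + 4 = 2 \cdot 3$.
Is v + w ∈ V? Yes: $\mathbf{v} + \mathbf{w} = \begin{pmatrix} 3 \\ 5 \\ 4 \end{pmatrix}$ and $3 + 5 = 2 \cdot 4$.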
We now consider sets of vectors with certain properties, called vector spaces.
17.5.1 Definition: Vector space, subspace
A subset V ⊆ Rn is called a vector space if it satisfies the following
(o) V is non-empty.
(i) If v ∈ V and w ∈ V then v + w ∈ V (closure under vector addition).
(ii) If v ∈ V and α ∈ R then αv ∈ V (closure under scalar multiplication).
If V is a vector space, W ⊆ V and W is also a vector space, we call W a subspace of V. Note: V is a subspace of itself.
If V ⊆ Rn is a vector space and v ∈ V , then by property (ii)
0 = 0v ∈ V
that is, the zero vector belongs to every vector space.
17.5.2 Example
Show that
\[
V = \left\{ \begin{pmatrix} a \\ b \\ c \end{pmatrix} \,\middle|\, a + b = 2c,\ a, b, c \in \mathbb{R} \right\}
\]
is a vector space.
(o) V is non-empty since 0 ∈ V.
Let $\mathbf{v} = \begin{pmatrix} a_1 \\ b_1 \\ c_1 \end{pmatrix}$, $\mathbf{w} = \begin{pmatrix} a_2 \\ b_2 \\ c_2 \end{pmatrix}$ be in V.
(i) $\mathbf{v} + \mathbf{w} = \begin{pmatrix} a_1 + a_2 \\ b_1 + b_2 \\ c_1 + c_2 \end{pmatrix}$.
$\mathbf{v} \in V \Rightarrow a_1 + b_1 = 2c_1$
$\mathbf{w} \in V \Rightarrow a_2 + b_2 = 2c_2$.
Is v + w in V?
\[
(a_1 + a_2) + (b_1 + b_2) = (a_1 + b_1) + (a_2 + b_2) = 2c_1 + 2c_2 = 2(c_1 + c_2)
\]
⇒ v + w ∈ V. So V is closed under addition.
(ii) If α is any scalar then $\alpha \mathbf{v} = \begin{pmatrix} \alpha a_1 \\ \alpha b_1 \\ \alpha c_1 \end{pmatrix}$.
Observe that $\alpha a_1 + \alpha b_1 = \alpha(a_1 + b_1) = \alpha \cdot 2c_1 = 2(\alpha c_1)$
⇒ αv ∈ V. So V is closed under scalar multiplication.
Hence V is a vector space.
17.5.3 Example
Is V = {0} a vector space?
(o) 0 ∈ V ⇒ V is non-empty.
(i) If v, w ∈ V then v = w = 0 and v + w = 0 ∈ V .
(ii) If α ∈ R, then α0 = 0 ∈ V .
17.5.4 Example
Show that V = Rn is a vector space.
(o) 0 ∈ V so V is non-empty.
(i) $\mathbf{v}, \mathbf{w} \in V \Rightarrow \mathbf{v} + \mathbf{w} \in \mathbb{R}^n = V$.
(ii) $\mathbf{v} \in V \Rightarrow \alpha \mathbf{v} \in \mathbb{R}^n = V$.
So V is a vector space.
17.5.5 Example
Is $V = \left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right\}$ a vector space?
No, since 0 ∉ V.
17.5.6 Example
Is $V = \left\{ \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix} \,\middle|\, \alpha \in \mathbb{R} \right\}$ a vector space?
(o) α = 0 ⇒ 0 ∈ V, so V is not empty.
(i) If v, w ∈ V then $\mathbf{v} = \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ and $\mathbf{w} = \beta \begin{pmatrix} 1 \\ 2 \end{pmatrix}$ for suitable α, β ∈ R.
Then
\[
\mathbf{v} + \mathbf{w} = \begin{pmatrix} \alpha \\ 2\alpha \end{pmatrix} + \begin{pmatrix} \beta \\ 2\beta \end{pmatrix} = \begin{pmatrix} \alpha + \beta \\ 2(\alpha + \beta) \end{pmatrix} = (\alpha + \beta) \begin{pmatrix} 1 \\ 2 \end{pmatrix} \in V.
\]
(ii) If γ ∈ R, then $\gamma \mathbf{v} = \gamma\alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix} \in V$.
Therefore V is a vector space.
Geometrically, all these vectors can be viewed as lying along a single line through the origin.
17.5.7 Example
Show that for $\mathbf{v}_1, \mathbf{v}_2 \in \mathbb{R}^3$,
\[ V = \{ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 \mid \alpha_1, \alpha_2 \in \mathbb{R} \} \]
is a vector space.
(o) If we choose $\alpha_1 = \alpha_2 = 0$ we see that $0\mathbf{v}_1 + 0\mathbf{v}_2 = \mathbf{0} \in V$, so V is non-empty.
(i) Let v, w ∈ V ⇒ there exist $\alpha_1, \alpha_2, \beta_1, \beta_2$ such that $\mathbf{v} = \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2$ and $\mathbf{w} = \beta_1 \mathbf{v}_1 + \beta_2 \mathbf{v}_2$. Then $\mathbf{v} + \mathbf{w} = (\alpha_1 + \beta_1)\mathbf{v}_1 + (\alpha_2 + \beta_2)\mathbf{v}_2 \in V$.
(ii) For any scalar γ ∈ R, $\gamma \mathbf{v} = \gamma\alpha_1 \mathbf{v}_1 + \gamma\alpha_2 \mathbf{v}_2 \in V$.
Therefore V is a vector space.
If we draw v1 and v2 starting from the origin, then all these combinations α1v1 + α2v2 will lie in a single plane in $\mathbb{R}^3$, through the origin.
For example, if v1 = i and v2 = j then V = {α1i + α2j | α1, α2 ∈ R} is the set of vectors with no k-component, i.e. the xy plane.
Thus geometrically, a vector space can be a single point {0}, or can be all vectors along a single line through the origin, or all vectors lying in a single plane through the origin, or similarly in higher dimensions.
Theorem: If A is a matrix, then NS(A) is a vector space.
Proof:
(o) A0 = 0 ⇒ 0 ∈ NS(A) ⇒ NS(A) is non-empty.
(i) v, w ∈ NS(A) ⇒ Av = 0, Aw = 0 ⇒ A(v + w) = Av + Aw = 0 + 0 = 0 ⇒ v + w ∈ NS(A).
(ii) If v ∈ NS(A), α ∈ R then Av = 0 ⇒ A(αv) = αAv = α0 = 0 ⇒ αv ∈ NS(A).
Hence NS(A) is a vector space.
This explains why either NS(A) = {0} or else it is infinite. If there is any non-zero solution v of Av = 0, there will be infinitely many solutions. This is why a consistent linear system can have one solution, or infinitely many, depending on the size of NS(A).
17.6 Eigenspace
17.6.1 Proposition (eigenspace is a vector space)
Let A be a square matrix. Let λ be an eigenvalue of A. Then the set of solutions of the equation Ax = λx forms a vector space called the eigenspace of A corresponding to eigenvalue λ, denoted $N_\lambda$. So
\[ N_\lambda = \{ \mathbf{x} \in \mathbb{R}^n \mid A\mathbf{x} = \lambda \mathbf{x} \}. \]
Special case: if λ = 0 is an eigenvalue of A then $N_0 = \{ \mathbf{x} \in \mathbb{R}^n \mid A\mathbf{x} = \mathbf{0} \} = NS(A)$, the nullspace of A as above.
17.6.2 Proof
(o) Note that 0 is always a solution to Ax = λx, so $\mathbf{0} \in N_\lambda$, so $N_\lambda$ is non-empty.
Take $\mathbf{x}, \mathbf{y} \in N_\lambda$, α ∈ R.
(i) Then
\[
A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y} = \lambda \mathbf{x} + \lambda \mathbf{y} = \lambda(\mathbf{x} + \mathbf{y}) \ \Rightarrow\ \mathbf{x} + \mathbf{y} \in N_\lambda.
\]
(ii)
\[
A(\alpha \mathbf{x}) = \alpha A\mathbf{x} = \alpha\lambda \mathbf{x} = \lambda(\alpha \mathbf{x}) \ \Rightarrow\ \alpha \mathbf{x} \in N_\lambda.
\]
Hence the set of solutions to Ax = λx forms a vector space.
This is why when we found eigenvectors previously, we actually found infinitely many eigenvectors for each λ.
17.7 The span of a set of vectors
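We would like a more efficient way to describe all the vectors in a vector space V, and a way of generating a vector space from a given set of vectors.
17.7.1 Definition: Span
The span of vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m \in \mathbb{R}^n$ is the set of all possible linear combinations of these vectors:
\[
\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\} = \{ \alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \cdots + \alpha_m \mathbf{v}_m \mid \alpha_1, \alpha_2, \ldots, \alpha_m \in \mathbb{R} \}.
\]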
Note that the span is a set of vectors.
17.7.2 Example
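Show that $\mathbf{v} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$ is in $\operatorname{span}\left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\}$.
We want
\[
\begin{pmatrix} 3 \\ 5 \end{pmatrix} = \alpha \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
Solving, α = 4, β = −1.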
17.7.3 Example
Show that $\operatorname{span}\left\{ \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ -1 \end{pmatrix} \right\} = \mathbb{R}^2$.
Let $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} \in \mathbb{R}^2$ be arbitrary. We need to find $\alpha_1$ and $\alpha_2 \in \mathbb{R}$ such that
\[
\mathbf{v} = \alpha_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \alpha_2 \begin{pmatrix} 1 \\ -1 \end{pmatrix} = \begin{pmatrix} \alpha_1 + \alpha_2 \\ \alpha_1 - \alpha_2 \end{pmatrix}.
\]
So we need:
\[
\begin{cases} v_1 = \alpha_1 + \alpha_2 \\ v_2 = \alpha_1 - \alpha_2 \end{cases} \ \Rightarrow\ \begin{cases} \alpha_1 = \tfrac{1}{2}(v_1 + v_2) \\ \alpha_2 = \tfrac{1}{2}(v_1 - v_2). \end{cases}
\]
Hence
\[
\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \frac{1}{2}(v_1 + v_2) \begin{pmatrix} 1 \\ 1 \end{pmatrix} + \frac{1}{2}(v_1 - v_2) \begin{pmatrix} 1 \\ -1 \end{pmatrix}.
\]
So any vector in $\mathbb{R}^2$ can be written as a combination of $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$ and $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$.
The previous example was $\mathbf{v} = \begin{pmatrix} 3 \\ 5 \end{pmatrix}$, so $v_1 = 3$, $v_2 = 5$, so $\tfrac{1}{2}(v_1 + v_2) = 4$, $\tfrac{1}{2}(v_1 - v_2) = -1$.
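The formulas for α1 and α2 can be tried on any vector. The sketch below is illustrative Python; the names `coefficients` and `rebuild` are assumptions, and integer components are assumed so that `Fraction` keeps the halves exact.

```python
from fractions import Fraction


def coefficients(v1, v2):
    """Return (alpha1, alpha2) with (v1, v2) = alpha1*(1, 1) + alpha2*(1, -1).

    Assumes integer components v1, v2.
    """
    return Fraction(v1 + v2, 2), Fraction(v1 - v2, 2)


def rebuild(alpha1, alpha2):
    """Form alpha1*(1, 1) + alpha2*(1, -1)."""
    return (alpha1 + alpha2, alpha1 - alpha2)


alpha1, alpha2 = coefficients(3, 5)
print(alpha1, alpha2)            # 4 -1
print(rebuild(alpha1, alpha2))   # (Fraction(3, 1), Fraction(5, 1))
```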
17.7.4 Example
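Show that $\operatorname{span}\left\{ \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \right\} = \mathbb{R}^3$.
Let $\mathbf{v} = \begin{pmatrix} a \\ b \\ c \end{pmatrix} \in \mathbb{R}^3$ be arbitrary. We need to find α, β, γ ∈ R such that
\[
\mathbf{v} = \alpha \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + \gamma \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]
That is,
\[
\begin{aligned}
\alpha + \beta + \gamma &= a \\
\beta + \gamma &= b \\
\gamma &= c
\end{aligned}
\]
Back substitution gives γ = c, β = b − c, α = a − b. So
\[
\mathbf{v} = \begin{pmatrix} a \\ b \\ c \end{pmatrix} = (a - b) \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + (b - c) \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + c \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]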
17.7.5 The Span of vectors is a Vector Space
The span of any non-empty set of vectors is a vector space.
We proved this for two vectors in Example 17.5.7. The general proof is similar.
If span{v1, v2, ..., vm} = V then we say that v1, . . . , vm span V, or that they form a spanning set for V.
17.7.6 Example
Find a spanning set for $V = \left\{ \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix} \,\middle|\, \alpha \in \mathbb{R} \right\}$.
Every vector in V is a linear combination of $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$, so a spanning set is $\left\{ \begin{pmatrix} 1 \\ 2 \end{pmatrix} \right\}$.
This answer is not unique. For example $\left\{ \begin{pmatrix} -1 \\ -2 \end{pmatrix} \right\}$ is another spanning set for V.
17.7.7 Example
The coordinate vectors $\mathbf{e}_i = (0, \ldots, 0, 1, 0, \ldots, 0)^T$, with the 1 in the ith position, span $\mathbb{R}^n$ since every vector $\mathbf{v} \in \mathbb{R}^n$ is a linear combination of these vectors (see equation 17.1).
17.7.8 Example
The vector space {0} is spanned by the zero vector 0 ∈ Rn and is called the zero vector space.
17.8 Bases
We have seen that we can describe a vector space by giving a spanning set. When we do this, we usually like to choose a spanning set that is as small as possible (contains no redundancies). We ensure this as follows.
17.8.1 Definition: Basis
Let V be a vector space. We say that a set of vectors B = {v1, ..., vm} ⊆ V is a basis for V if:
1. B spans V ;
2. B is linearly independent.
To prove that a set of vectors is a basis, we must prove both properties.
Sometimes we are just given a vector space V, and asked to find a basis. Problems like this are solved by finding a spanning set for V, and then checking that the set is linearly independent.
A single vector space V may have many different bases.
17.8.2 Example
We have seen that the coordinate vectors $\mathbf{e}_i$, i = 1, ..., n, are linearly independent (section 17.3.2) and span $\mathbb{R}^n$ (section 17.7.7). Therefore they form a basis for $\mathbb{R}^n$, called the standard basis.
17.8.3 Example
Show that the vectors $\begin{pmatrix} 1 \\ 1 \end{pmatrix}$, $\begin{pmatrix} 1 \\ -1 \end{pmatrix}$ form a basis for $\mathbb{R}^2$.
We must check: (i) linear independence; (ii) the set spans $\mathbb{R}^2$.
(i) was example 17.3.1.
(ii) was example 17.7.3.
Note that this basis differs from the standard basis $\left\{ \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right\}$ of $\mathbb{R}^2$, but still has two elements.
17.8.4 Example
Find a basis for the vector space of 17.5.2
\[
V = \left\{ \begin{pmatrix} a \\ b \\ c \end{pmatrix} \,\middle|\, a + b = 2c,\ a, b, c \in \mathbb{R} \right\}.
\]
Any vector in V is of the form
\[
\mathbf{v} = \begin{pmatrix} 2c - b \\ b \\ c \end{pmatrix} = \begin{pmatrix} 2c \\ 0 \\ c \end{pmatrix} + \begin{pmatrix} -b \\ b \\ 0 \end{pmatrix} = c\,\mathbf{v}_1 + b\,\mathbf{v}_2, \qquad \mathbf{v}_1 = \begin{pmatrix} 2 \\ 0 \\ 1 \end{pmatrix},\ \mathbf{v}_2 = \begin{pmatrix} -1 \\ 1 \\ 0 \end{pmatrix},
\]
with c, b ∈ R. Thus {v1, v2} is a spanning set for V: V = span{v1, v2}.
For {v1, v2} to be a basis, we need to show v1 and v2 are linearly independent. If $\alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 = \mathbf{0}$ then from the second coordinate $\alpha_2 = 0$ and then from the first $\alpha_1 = 0$, so {v1, v2} is also linearly independent.
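The decomposition v = c v1 + b v2 with v1 = (2, 0, 1) and v2 = (−1, 1, 0) can be spot-checked on concrete members of V. This Python sketch is only an illustration; the helper names `in_V`, `components` and `rebuild` are hypothetical.

```python
V1 = (2, 0, 1)
V2 = (-1, 1, 0)


def in_V(v):
    """Membership test for V = {(a, b, c) : a + b = 2c}."""
    a, b, c = v
    return a + b == 2 * c


def components(v):
    """Return (c, b), the coefficients of V1 and V2, for a vector v in V."""
    if not in_V(v):
        raise ValueError("vector is not in V")
    a, b, c = v
    return c, b


def rebuild(coeffs):
    """Form c*V1 + b*V2 from the pair (c, b)."""
    c, b = coeffs
    return tuple(c * p + b * q for p, q in zip(V1, V2))


v = (3, 5, 4)
print(in_V(v))                  # True
print(components(v))            # (4, 5)
print(rebuild(components(v)))   # (3, 5, 4)
```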
17.9 Dimension
Suppose B = {v1, ..., vm} is a basis for V. Then it can be shown that every basis for V has m vectors. We call m the dimension of V and write m = dim V. (By convention, the zero vector space {0} has dimension zero.)
17.9.1 Example
We have seen that the coordinate vectors ei, form a basis for Rn, so
dimRn = n.
17.9.2 Example
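The vector space
\[
V = \left\{ \alpha \begin{pmatrix} 1 \\ 2 \end{pmatrix} \,\middle|\, \alpha \in \mathbb{R} \right\} \subseteq \mathbb{R}^2
\]
is spanned by a single vector $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ (example 17.7.6) which thus constitutes a basis for V. Hence
\[ \dim V = 1. \]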
Geometrically, V consists of vectors along a single line, which is a one dimensional object.
17.9.3 Example
Consider the vector space of 17.5.2
\[
V = \left\{ \begin{pmatrix} a \\ b \\ c \end{pmatrix} \,\middle|\, a + b = 2c,\ a, b, c \in \mathbb{R} \right\}.
\]
In example 17.8.4 we found a basis with 2 vectors, so dim V = 2. Note that V is a subspace of $\mathbb{R}^3$, but it consists of a plane through the origin, which geometrically is a two dimensional object.
17.10 Components
Suppose B = {v1, . . . , vm} is a basis for V. Then every v ∈ V may be written as a linear combination of elements in B
\[ \mathbf{v} = \alpha_1 \mathbf{v}_1 + \cdots + \alpha_m \mathbf{v}_m \]
and this representation is unique: if we also have
\[ \mathbf{v} = \beta_1 \mathbf{v}_1 + \cdots + \beta_m \mathbf{v}_m \]
then $\alpha_1 = \beta_1, \ldots, \alpha_m = \beta_m$.
Proof:
Since B is a basis, it spans V.
Hence every v can be written as some linear combination of v1, . . . , vm.
We need to prove uniqueness. But
\[
\mathbf{0} = \mathbf{v} - \mathbf{v} = (\alpha_1 \mathbf{v}_1 + \cdots + \alpha_m \mathbf{v}_m) - (\beta_1 \mathbf{v}_1 + \cdots + \beta_m \mathbf{v}_m) = (\alpha_1 - \beta_1)\mathbf{v}_1 + \cdots + (\alpha_m - \beta_m)\mathbf{v}_m.
\]
Since B is linearly independent, $\alpha_i - \beta_i = 0$, i = 1, ..., m.
So $\alpha_i = \beta_i$, meaning that the representation is unique.
If v = α1v1 + · · ·+ αmvm we call αi the ith component of v with respect to the basis B.
17.11 Further properties of bases
With more work, one can also prove the following:
(i) Every vector space has a basis.
(ii) If m = dimV , any m linearly independent vectors in V will form a basis.
(iii) If m = dim V, any set of more than m vectors in V will be linearly dependent (and so not be a basis).
(iv) If m = dimV , any set of fewer than m vectors in V will not span V (and so not be a basis).
(v) The only m-dimensional subspace of Rm is Rm itself.
(vi) If W is a subspace of V , then dimW ≤ dimV and dimW = dimV if and only if V = W .
17.11.1 Example
Show that in R3, the vectors
\[
\mathbf{v}_1 = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{v}_2 = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{v}_3 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
\]
are linearly independent and hence form a basis for $\mathbb{R}^3$.
Setting $\alpha \mathbf{v}_1 + \beta \mathbf{v}_2 + \gamma \mathbf{v}_3 = \mathbf{0}$ implies
\[
\alpha \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix} + \beta \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix} + \gamma \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \Rightarrow \begin{pmatrix} \alpha + \beta + \gamma \\ \beta + \gamma \\ \gamma \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix} \Rightarrow \gamma = 0 \Rightarrow \beta = 0 \Rightarrow \alpha = 0.
\]
Therefore the vectors are linearly independent.
Because we already know $\dim \mathbb{R}^3 = 3$ and we have found 3 linearly independent vectors, they must form a basis, by (vi) above.
(Alternatively: we proved they span in example 17.7.4.)
The shortcut in this example is fine if we already know the dimension of V. If we are just given a space like
\[
V = \left\{ \begin{pmatrix} a \\ b \\ c \end{pmatrix} \,\middle|\, a + b = 2c,\ a, b, c \in \mathbb{R} \right\}
\]
it may not be obvious what the dimension is.
18 Review
It is assumed that you are familiar with the material in the following sections.
• A vector quantity has both a magnitude and a direction. Force and velocity are examples of vector quantities.
• A scalar quantity has only a magnitude (it has no direction). Time, area and temperature are examples of scalar quantities.
18.1 Review
A vector is represented geometrically in the (x, y) plane (or in (x, y, z) space) by a directed line segment (arrow). The direction of the arrow is the direction of the vector, and the length of the arrow is proportional to the magnitude of the vector. Only the length and direction of the arrow are significant: it can be placed anywhere convenient in the (x, y) plane (or (x, y, z) space).
Figure 40: The vector of unit length at 45° to the x-axis has many representations.
Figure 41: Geometric representation of a vector, panels (a) and (b).
If P, Q are points, $\overrightarrow{PQ}$ denotes the vector from P to Q.
A vector $\mathbf{v} = \overrightarrow{PQ}$ in the (x, y) plane may be represented by a pair of numbers
\[
\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix} = \begin{pmatrix} x_Q - x_P \\ y_Q - y_P \end{pmatrix}
\]
which is the same for all representations $\overrightarrow{PQ}$ of v. We call $v_1, v_2$ the components of the vector v.
We call the vector $\begin{pmatrix} 0 \\ 0 \end{pmatrix}$ the zero vector. It is denoted by 0.
Vectors will be indicated by bold lowercase letters v, w etc. Writing by hand you may use v or ~v or v.
18.2 Position Vectors
Let $P = (x_P, y_P)$ be a point in the (x, y) plane. The vector $\overrightarrow{OP}$, where O is the origin, is called the position vector of P. Obviously $\overrightarrow{OP} = \begin{pmatrix} x_P \\ y_P \end{pmatrix}$.
18.3 Definition: Norm
For a vector $\mathbf{v} = \overrightarrow{PQ}$, the norm (or length or magnitude) of v, written $\|\mathbf{v}\|$, is the distance between P and Q. Thus for $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$ we have
$$\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2}$$
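As a quick numerical illustration of the norm formula, here is a minimal Python sketch. The function name `norm` is chosen for this example only; it is not part of the course material.

```python
import math

def norm(v):
    """Return the Euclidean norm sqrt(v1^2 + v2^2 + ...) of a vector
    given as a sequence of components."""
    return math.sqrt(sum(c * c for c in v))

# The vector with components (3, 4) has norm sqrt(9 + 16) = 5.
print(norm((3.0, 4.0)))  # 5.0
```

The same function works unchanged for vectors with three components.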
18.4 Vector Addition
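We add vectors by the triangle rule.
Consider the triangle PQR with $\mathbf{v} = \overrightarrow{PQ}$, $\mathbf{w} = \overrightarrow{QR}$. Then $\mathbf{v} + \mathbf{w} = \overrightarrow{PR}$; see Figure 42. In terms of components, if
$$\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}, \quad \mathbf{w} = \begin{pmatrix} w_1 \\ w_2 \end{pmatrix}$$
then
$$\mathbf{v} + \mathbf{w} = \begin{pmatrix} v_1 + w_1 \\ v_2 + w_2 \end{pmatrix}.$$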
It follows from the component description that vector addition satisfies the following properties:
v + w = w + v (commutative law)
u + (v + w) = (u + v) + w (associative law)
v + 0 = 0 + v = v
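The three laws can be spot-checked on concrete vectors. The sketch below uses a hypothetical helper `add` that adds tuples component by component; checking a few examples illustrates the laws but does not prove them.

```python
def add(v, w):
    """Add two vectors component by component."""
    return tuple(a + b for a, b in zip(v, w))

u, v, w = (1, 2), (3, -1), (-4, 5)
zero = (0, 0)

assert add(v, w) == add(w, v)                  # commutative law
assert add(u, add(v, w)) == add(add(u, v), w)  # associative law
assert add(v, zero) == add(zero, v) == v       # adding the zero vector
```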
Figure 42: $\overrightarrow{PQ} + \overrightarrow{QR} = \overrightarrow{PR}$.
Figure 43: We can multiply the vector v by a number α (scalar).
18.5 Scalar multiplication
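If α is a real number (called a scalar), we define αv to be the vector of norm
$$\|\alpha\mathbf{v}\| = |\alpha| \cdot \|\mathbf{v}\|$$
in the same direction as v if α > 0, and opposite direction if α < 0.
Using similar triangles it follows that if $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$ then $\alpha\mathbf{v} = \begin{pmatrix} \alpha v_1 \\ \alpha v_2 \end{pmatrix}$.
If we multiply any vector $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$ by zero we obtain the zero vector:
$$0 \cdot \mathbf{v} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = \mathbf{0}$$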
18.6 Unit Vectors
A unit vector is a vector of norm 1. If $\mathbf{v} \neq \mathbf{0}$ is a vector, then
$$\frac{1}{\|\mathbf{v}\|}\mathbf{v}$$
is a unit vector in the direction of v.
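A minimal Python sketch of this normalisation step follows. The name `unit` is illustrative only, and the zero vector is rejected because it has no direction.

```python
import math

def unit(v):
    """Return the unit vector (1/||v||) v in the direction of v."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0:
        raise ValueError("the zero vector has no direction")
    return tuple(c / length for c in v)

# (3, 4) has norm 5, so the unit vector is (3/5, 4/5).
print(unit((3.0, 4.0)))  # (0.6, 0.8)
```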
In particular
$$\mathbf{i} = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$
determine unit vectors along the x and y axes respectively.
Figure 44: Unit vectors i and j in the x and y directions respectively.
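For any vector $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \end{pmatrix}$ we have
$$\mathbf{v} = \begin{pmatrix} v_1 \\ 0 \end{pmatrix} + \begin{pmatrix} 0 \\ v_2 \end{pmatrix} = v_1\mathbf{i} + v_2\mathbf{j}.$$
Hence we can decompose v into a vector $v_1\mathbf{i}$ along the x-axis and $v_2\mathbf{j}$ along the y-axis. The numbers $v_1$ and $v_2$ are called the components of v with respect to i and j.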
18.7 Vectors in 3 dimensions
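Similarly in (x, y, z)-space a vector $\mathbf{v} = \overrightarrow{PQ}$ is represented in component form by
$$\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix} = \begin{pmatrix} x_Q - x_P \\ y_Q - y_P \\ z_Q - z_P \end{pmatrix}$$
which is the same for all representations $\overrightarrow{PQ}$ of v.
[Figure: the vector $\mathbf{v} = \overrightarrow{OP}$ in (x, y, z) space with components $v_1$, $v_2$, $v_3$, and the point N used in the norm formula below.]
For $\mathbf{v} = \overrightarrow{OP} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}$ the norm of the vector v is
$$\|\mathbf{v}\| = \sqrt{ON^2 + NP^2} = \sqrt{v_1^2 + v_2^2 + v_3^2}.$$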
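As before, we add vectors component by component. We also define multiplication by a scalar α. So if
$$\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \quad \mathbf{w} = \begin{pmatrix} w_1 \\ w_2 \\ w_3 \end{pmatrix}$$
are vectors then
$$\mathbf{v} + \mathbf{w} = \begin{pmatrix} v_1 + w_1 \\ v_2 + w_2 \\ v_3 + w_3 \end{pmatrix} \quad\text{and}\quad \alpha\mathbf{v} = \begin{pmatrix} \alpha v_1 \\ \alpha v_2 \\ \alpha v_3 \end{pmatrix}, \quad\text{for } \alpha \in \mathbb{R}.$$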
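The unit vectors along the x, y, z axes are respectively
$$\mathbf{i} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \mathbf{j} = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \mathbf{k} = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix}.$$
Any vector $\mathbf{v} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}$ may be expressed as
$$\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}.$$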
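We call $v_1$, $v_2$, $v_3$ the components of v in the i, j, k directions respectively.
We usually denote 2 and 3 dimensional space by $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively. Thus
$$\mathbb{R}^2 = \left\{ \begin{pmatrix} x \\ y \end{pmatrix} \;\middle|\; x, y \in \mathbb{R} \right\} \quad\text{and}\quad \mathbb{R}^3 = \left\{ \begin{pmatrix} x \\ y \\ z \end{pmatrix} \;\middle|\; x, y, z \in \mathbb{R} \right\}.$$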
18.8 Row and column vectors
We may write vectors using columns, e.g. $\begin{pmatrix} a \\ b \end{pmatrix}$, or as row vectors $\begin{pmatrix} a & b \end{pmatrix}$. We usually use column vectors in this course.
18.9 Dot Product
For nonzero vectors $\mathbf{v} = \overrightarrow{OP}$, $\mathbf{w} = \overrightarrow{OQ}$ the angle between v and w is the angle θ with 0 ≤ θ ≤ π radians between $\overrightarrow{OP}$ and $\overrightarrow{OQ}$ at the origin O; see Figure 45.
Figure 45: θ is the angle between v and w.
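The dot (or scalar or inner) product of vectors v and w, denoted by $\mathbf{v} \cdot \mathbf{w}$, is the number given by
$$\mathbf{v} \cdot \mathbf{w} = \begin{cases} 0, & \text{if } \mathbf{v} = \mathbf{0} \text{ or } \mathbf{w} = \mathbf{0} \\ \|\mathbf{v}\| \cdot \|\mathbf{w}\| \cos\theta, & \text{otherwise} \end{cases}$$
where θ is the angle between v and w.
If $\mathbf{v}, \mathbf{w} \neq \mathbf{0}$ and $\mathbf{v} \cdot \mathbf{w} = 0$ then v and w are said to be orthogonal or perpendicular.
If $\mathbf{v} = v_1\mathbf{i} + v_2\mathbf{j} + v_3\mathbf{k}$ and $\mathbf{w} = w_1\mathbf{i} + w_2\mathbf{j} + w_3\mathbf{k}$ are two vectors, then $\mathbf{v} \cdot \mathbf{w}$ is given by:
$$\mathbf{v} \cdot \mathbf{w} = v_1w_1 + v_2w_2 + v_3w_3.$$
In particular, for $\mathbf{v} \in \mathbb{R}^3$, $\|\mathbf{v}\|^2 = \mathbf{v} \cdot \mathbf{v} = v_1^2 + v_2^2 + v_3^2$.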
18.9.1 Example
Find the angle θ between the vectors:
$$\mathbf{v} = \begin{pmatrix} 1 \\ -5 \\ 4 \end{pmatrix}, \quad \mathbf{w} = \begin{pmatrix} 3 \\ 3 \\ 3 \end{pmatrix}$$
$$\cos\theta = \frac{\mathbf{v} \cdot \mathbf{w}}{\|\mathbf{v}\| \cdot \|\mathbf{w}\|}.$$
$$\mathbf{v} \cdot \mathbf{w} = 1 \cdot 3 - 5 \cdot 3 + 4 \cdot 3 = 0.$$
So cos θ = 0, so the vectors are perpendicular, i.e. $\theta = \frac{\pi}{2}$.
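The same calculation can be checked numerically. The sketch below is an illustration only; the helper names `dot` and `angle` are made up for this example, and the cosine is clamped to [-1, 1] to guard against rounding error before calling `math.acos`.

```python
import math

def dot(v, w):
    """Dot product v1*w1 + v2*w2 + ... of two vectors."""
    return sum(a * b for a, b in zip(v, w))

def angle(v, w):
    """Angle in radians (between 0 and pi) between nonzero vectors v and w."""
    nv, nw = math.sqrt(dot(v, v)), math.sqrt(dot(w, w))
    if nv == 0 or nw == 0:
        raise ValueError("the angle is only defined for nonzero vectors")
    c = dot(v, w) / (nv * nw)
    return math.acos(max(-1.0, min(1.0, c)))

v, w = (1, -5, 4), (3, 3, 3)
print(dot(v, w))    # 0, so the vectors are orthogonal
print(angle(v, w))  # pi/2 up to rounding
```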
18.10 Trigonometric functions (sin, cos, tan)
In calculus, we measure angles in radians. One radian is defined as the angle subtended at the centre of a circle of radius 1 by a segment of arc length 1.
Figure 46: Angle in radians.
By definition, 2π radians = 360°. To convert between the two measures use the following formulae. For x the angle in radians and θ the angle in degrees,
$$x \times \frac{360^\circ}{2\pi} = \theta, \qquad \theta \times \frac{2\pi}{360^\circ} = x.$$
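The two conversion formulae translate directly into code. This is a small Python sketch with illustrative function names; Python's standard library also provides `math.degrees` and `math.radians` for the same purpose.

```python
import math

def to_degrees(x):
    """Convert an angle x in radians to degrees: theta = x * 360 / (2*pi)."""
    return x * 360.0 / (2.0 * math.pi)

def to_radians(theta):
    """Convert an angle theta in degrees to radians: x = theta * 2*pi / 360."""
    return theta * (2.0 * math.pi) / 360.0

print(to_degrees(math.pi))  # 180.0
print(to_radians(90.0))     # pi/2 up to rounding
```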
The point on the unit circle $x^2 + y^2 = 1$ making an angle θ (in radians) anti-clockwise from the x-axis has coordinates (cos θ, sin θ), so $\cos^2\theta + \sin^2\theta = 1$. This defines the cos and sin functions. This relationship is shown in Figure 47.
Figure 47: The relationship between the cosine and sine functions.
The graphs of the functions cos θ and sin θ are shown below in Figure 48. Note how the two functions have the same behaviour, but are shifted (i.e. phase shifted) by $\frac{\pi}{2}$. This leads to the relationship
$$\cos\theta = \sin\left(\frac{\pi}{2} - \theta\right) = \sin\left(\theta + \frac{\pi}{2}\right).$$
Figure 48: The relationship between the cosine and sine functions (curves cos(x) and sin(x) plotted against x).
In both cases in Figure 48 the range is [−1, 1]. There are some unmarked vertical dotted lines. What values do they represent? (This graph was made using MATLAB. To get both curves on the same axes, we used the hold on command.)
The tan function is related to the sine and cosine functions by
$$\tan\theta = \frac{\sin\theta}{\cos\theta}.$$
It is defined for cos θ ≠ 0. It is therefore not defined at values of θ which are odd integer multiples of $\frac{\pi}{2}$ (e.g. $\pm\frac{\pi}{2}$, $\pm\frac{3\pi}{2}$, $\pm\frac{5\pi}{2}$ etc).
The graph of tan θ is given in Figure 49 below. Mark the x values in the graph where the function is not defined.
Figure 49: Graph of tan θ.
Figure 50 shows another common function,
$$\cot x = \frac{\cos x}{\sin x}.$$
Figure 50: Graph of cot θ.
The periodic nature of all the trigonometric functions we have seen makes them ideal for modelling repetitive phenomena such as tides, vibrating strings, and various types of natural wave-like behaviour.
Both the sine and cosine functions have period 2π. Hence
sin(x+ 2π) = sinx, and
cos(x+ 2π) = cosx.
Furthermore, both sin(x) and cos(x) have an amplitude of one.
We can “speed up” or “slow down” these functions by tinkering with their periodicity. For example, compare the functions sin(x) and $\sin\left(\frac{x}{2}\right)$ in Figure 51. Compare also the functions sin(x) and sin(2x) in Figure 52.
Figure 51: Graph of sin(x) and $\sin\left(\frac{x}{2}\right)$.
Figure 52: Graph of sin(x) and sin(2x).
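The effect seen in Figures 51 and 52 can be summarised by the standard fact that sin(kx) has period 2π/k for k > 0. The sketch below checks this numerically at a few sample points; the function name `period` and the sample values are chosen for illustration only.

```python
import math

def period(k):
    """Period of sin(k*x) and cos(k*x) for a nonzero constant k."""
    return 2.0 * math.pi / abs(k)

for k in (0.5, 1.0, 2.0):
    p = period(k)
    for x in (-3.0, 0.7, 5.2):
        # Shifting x by one full period must not change the value.
        assert math.isclose(math.sin(k * (x + p)), math.sin(k * x), abs_tol=1e-12)

print(period(0.5) / math.pi, period(2.0) / math.pi)  # 4.0 1.0
```

So sin(x/2) repeats every 4π (slowed down) and sin(2x) repeats every π (sped up).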
In addition, we can stretch or shrink trigonometric functions by multiplying these functions by constants other than one. For example, Figure 53 shows the functions $\frac{1}{2}\cos(x)$ and 3 cos(x). What amplitudes do these have?
Figure 53: Graph of $\frac{1}{2}\cos(x)$ and 3 cos(x).
Naturally, we can change both the period and amplitude of trigonometric functions simultaneously. Figure 54 shows the graphs of $4\cos\left(\frac{x}{3}\right)$ and 2 sin(5x).
Figure 54: Graphs of $4\cos\left(\frac{x}{3}\right)$ and 2 sin(5x).
18.10.1 Example
Suppose that the temperature in Brisbane is periodic, peaking at 32 degrees Celsius on January 1, and sinking to 8 degrees Celsius in June; see Figure 55. Use the cosine function to model temperature as a function of time (in days).
Figure 55: Graph of temperature fluctuations in Brisbane (temperature T against time t in days).
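One possible model is sketched below. It rests on assumptions that are not stated in the example: t = 0 is January 1, the period is 365 days, and the minimum falls exactly half a period after the maximum. Under those assumptions the mean is (32 + 8)/2 = 20, the amplitude is (32 − 8)/2 = 12, and T(t) = 20 + 12 cos(2πt/365).

```python
import math

MEAN = (32 + 8) / 2       # midpoint between the maximum and the minimum
AMPLITUDE = (32 - 8) / 2  # half the distance between maximum and minimum
PERIOD = 365.0            # assumed length of one cycle, in days

def temperature(t):
    """Modelled temperature in degrees Celsius, t days after January 1."""
    return MEAN + AMPLITUDE * math.cos(2.0 * math.pi * t / PERIOD)

print(temperature(0))  # 32.0, the peak on January 1
```

With this choice the minimum of 8 degrees occurs at t = 182.5, which is around the end of June.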
19 Appendix - Practice Problems
You should attempt the practice problems in this section while learning the material in the course. Try to solve all questions. Do at least questions marked with a star (*) before attempting assignment questions. Have a look at the questions before your tutorial. Try to solve them yourself first, and ask your tutor at the tutorial if you need help. Solutions to questions on this sheet are not to be handed in.
19.1 Numbers
1. Solve for x.
(a) |x+ 1| < 1
(b) |x+ 1| > 1
(c) |3x+ 5| ≤ 2
2. Simplify.
(a) $\frac{3+i}{3-i}$
(b) $\frac{(2+3i)(3+2i)}{1+i}$
3. Write $z = 1 - \sqrt{3}\,i$ in polar form.
4. Find the roots of $x^2 - 4x + 5 = 0$.
19.2 Functions
1. Graph the following functions and give their domain and range.
(a) $f(x) = x^2 + 2$
∗(b) $f(x) = \frac{1}{x-2}$
∗(c) $f(x) = -\frac{1}{x-2}$
(d) $f(x) = \frac{1}{x^2+2}$
∗(e) $f(x) = \frac{x+2}{x^2-4}$
∗(f) $f(x) = \sqrt{x^2 - 9}$
(g) $f(x) = -\sqrt{x^2 + 9}$
(h) $f(x) = e^{x+1}$
(i) $f(x) = 1 + e^{2x+3}$.
2. Determine whether the graphs below represent graphs of functions. If so, state the range and domain of the function.
3. Find the domain and sketch the graph of the function
$$g(x) = \frac{|x|}{x^2}.$$
4. Given
$$f(x) = \begin{cases} 2x + 3, & x < -1 \\ 3 - x, & x > -1 \end{cases}$$
state the domain and range of f.
5. Two species of kangaroos inhabit a national park. There are currently 4,500 grey kangaroos, whose population increases at a rate of 4% a year. There are only 3,000 red kangaroos, but they increase at a rate of 6% a year. In how many years will the red kangaroos outnumber the grey kangaroos? Use logarithms to find the answer and draw a graph of the two populations.
6. Let P(t) be the amount of a radioactive material at time t. Experiments indicate P(t) decays exponentially with time, so
$$P(t) = P_0 e^{-kt}$$
where $P_0$ is the amount of material at time t = 0 (initial amount) and k is a positive constant (called the decay constant). The half-life T is the time taken for the material to decay to one half of its initial amount. Obtain T in terms of k.
7. If y = tan(x), 0 ≤ x < π/2, find cos(x) as a function of y.
∗ 8. Find the inverse of $f(x) = \frac{1}{3}\cos(2x)$ and state its domain and range.
9. For $f(x) = 3x + 2$, $g(x) = \ln(x^2 + 1)$ and $h(x) = 3e^{2x+1}$, find
(a) f(g(x))
(b) g(f(x))
(c) f(h(x))
(d) h(f(x))
(e) g(h(x))
(f) f(g(h(x)))
(g) h(g(f(x))).
10. Show from the definition that the following two functions are 1-1 on their domains (a graph is not sufficient here). Find the inverse function, and state the domain and range of both $f$ and $f^{-1}$.
(a) $f(x) = e^{3x+1}$
∗ (b) $f(x) = \frac{x+1}{x+2}$.
19.3 Limits
1. Let
$$f(x) = \frac{x - 3}{x^2 - 9}.$$
Use the limit laws to find $L = \lim_{x\to 3} f(x)$.
(a) Find a small interval around 3 where $|f(x) - L| \le 0.01$.
∗ 2. Prove that $\lim_{x\to 0} x^4 \cos\left(\frac{2x+1}{3x^2+5}\right) = 0$.
3. Evaluate the following limits
∗ (a) $\lim_{x\to 2} \frac{x^2+2x-8}{x^2-5x+6}$
(b) $\lim_{x\to -1} \frac{x^2-x-12}{x+3}$
(c) $\lim_{x\to -1} \frac{x+1}{x^2-x-2}$
(d) $\lim_{t\to 1} \frac{t^3-t}{t^2-1}$
(e) $\lim_{x\to 0} \frac{\sqrt{x+2}-\sqrt{2}}{x}$ (Hint: multiply top and bottom by the conjugate surd $\sqrt{x+2}+\sqrt{2}$).
4. Show
$$\lim_{x\to 3} \frac{x}{5} = \frac{3}{5}$$
using the ε, δ definition of the limit.
5. Prove that
$$\lim_{x\to 0} x\sin\left(\frac{1}{x}\right) = 0.$$
6. Evaluate the following limits.
(a) $\lim_{x\to\infty} \frac{3x^2+2x+1}{x^2+x+3}$
∗ (b) $\lim_{x\to\infty} (\sqrt{x+1}-\sqrt{x})$ (Hint: multiply top and bottom by the conjugate surd $\sqrt{x+1}+\sqrt{x}$)
(c) $\lim_{x\to\infty} \frac{x^2-1}{x^3+2}$
(d) $\lim_{x\to\infty} \frac{x-x^3}{x^2+1}$
(e) $\lim_{x\to-\infty} \frac{x+1}{1-x}$.
19.4 Continuity
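1. Determine whether or not f(x) is continuous at x = 1 and sketch the graph of f(x) for the following choices of f(x):
(a) $f(x) = \begin{cases} x^2 - 2x + 2, & \text{if } x < 1 \\ 3 - x, & \text{if } x \ge 1 \end{cases}$
(b) $f(x) = \begin{cases} x^3, & \text{if } x < 1 \\ (x-2)^2, & \text{if } x \ge 1 \end{cases}$
(c) $f(x) = \begin{cases} \frac{x-1}{x^2-1}, & \text{if } x \ne 1 \\ 6, & \text{if } x = 1. \end{cases}$
∗ 2. For what value of a is f(x) continuous for all x, where
$$f(x) = \begin{cases} \frac{x^2-3x+2}{x-1}, & x \ne 1 \\ a, & x = 1 \end{cases}$$
Give clear explanations.
3. Is the following function continuous at x = −5?
$$f(x) = \begin{cases} \frac{x+5}{x^2-25}, & x \ne -5 \\ -\frac{1}{10}, & x = -5 \end{cases}$$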
19.5 Derivatives
1. (a) Sketch the graph of f(x) = x|x|.
(b) For which values of x is f differentiable?
(c) Obtain a formula for f′(x).
2. The hyperbolic sine and cosine functions are defined by
$$\sinh x = \frac{1}{2}(e^x - e^{-x}), \quad \cosh x = \frac{1}{2}(e^x + e^{-x})$$
respectively.
(a) Prove $\cosh^2 x - \sinh^2 x = 1$.
(b) Prove $\frac{d}{dx}\sinh x = \cosh x$, $\frac{d}{dx}\cosh x = \sinh x$.
(c) Find $\lim_{x\to 0} \frac{\sinh x}{x}$.
3. The Tasmanian Devil often preys on animals larger than itself. Researchers have identified the mathematical relationship
$$L(w) = 2.265\,w^{2.543},$$
where w (in kg) is the weight and L(w) (in mm) is the length of the Tasmanian Devil. Suppose that the weight of a particular Tasmanian Devil can be estimated by the function
$$w(t) = 0.125 + 0.18t,$$
where w(t) is the weight (in kg) and t is the age, in weeks, of a Tasmanian Devil that is less than one year old. How fast is the length of a 30-week-old Tasmanian Devil changing?
4. Show that the equation of the tangent line to a point $(x_0, y_0)$ on the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ is given by
$$\frac{x_0 x}{a^2} + \frac{y_0 y}{b^2} = 1.$$
∗ 5. Consider the function $f(x) = x^x$. Evaluate f′(x) for x > 0.
6. A new car costing P dollars depreciates at an annual rate of r%, so that after t years the car will be worth C dollars where
$$C(t) = P\left(1 - \frac{r}{100}\right)^t.$$
(a) Find $\frac{dC}{dt}$.
(b) How much will the car be worth in the long run?
∗ 7. Findd
dθ
θ
cos(sin θ).
8. A spherical snowball is melting in such a way that its volume is decreasing at a rate of 1 cm³/min. At what rate is the diameter decreasing when the diameter is 10 cm?
9. Using L’Hopital’s rule evaluate the following limits
∗(a) $\lim_{\theta\to 0} \frac{1-\cos\theta}{\theta^2}$
(b) $\lim_{x\to 0} \frac{e^x-1}{\sin x}$
(c) $\lim_{x\to 0} \frac{\sin x - x}{x^3}$
∗(d) $\lim_{x\to 1} \left(\frac{1}{\ln x} - \frac{1}{x-1}\right)$
(e) $\lim_{x\to\infty} e^{-x}\ln x$
(f) $\lim_{\theta\to 0} \frac{(\sin\theta)^n}{\sin(\theta^n)}$, $1 \le n \in \mathbb{N}$.
10. Use the mean value theorem to prove the following (this is the second derivative test). Suppose f : [a, b] → R is continuous, and has continuous first and second derivatives on [a, b]. Then if c ∈ (a, b) is a point with f′(c) = 0 and f″(c) > 0 then f has a local minimum at c.
11. Determine where the following functions are increasing/decreasing.
(a) f(x) = (x− 1)(x+ 2) = x2 + x− 2
(b) f(x) = x|x|.
(c) f(x) = sinx, x ∈ (0, π)
(d) f(x) = e−x+1ex
12. Prove the following: $e^x \ge x + 1$, for all x ≥ 0.
Hint: Consider $f(x) = e^x - x - 1$.
13. Prove the following: sin x ≤ x, for all x ≥ 0.
14. Find the absolute maximum and minimum of the function $f(x) = x^3 - 3x + 2$ on the closed interval [0, 2].
15. Find all relative maxima and minima and thus sketch the graph of the function
$$f(x) = x + \frac{1}{x}.$$
∗ 16. Show that the function
$$f(x) = \begin{cases} x\ln x, & x > 0 \\ 0, & x = 0 \end{cases}$$
is continuous on [0, ∞). What are the absolute maximum and minimum values of f(x) on [0, 2]?
17. To increase the velocity of the air flowing through the trachea when a human coughs, the body contracts the windpipe, producing a more effective cough. Tuchinsky formulated that the velocity of air that is flowing through the trachea during a cough is
$$V = C(R_0 - R)R^2,$$
where C > 0 is a constant based on individual body characteristics, $R_0$ is the radius of the windpipe before the cough, and R is the radius of the windpipe during the cough. Find the value of R that maximizes the velocity. (Interestingly, Tuchinsky also states that X-rays indicate that the body naturally contracts the windpipe to this radius during a cough.)
19.6 Integration
1. Evaluate
(a) $\int (x^3 - 2x + 1)\,dx$
(b) $\int x^{-a}\,dx$, a > 0.
(c) $\int \sin\left(x + \frac{\pi}{3} + 2x\right)dx$
(d) $\int \left(x + \frac{1}{x}\right)^2 dx$
2. Let f be continuous on an interval I. Let $F_1$ and $F_2$ be two different anti-derivatives of f. Show that $F_1$ and $F_2$ differ by a constant only. (Hint: Recall that if f′(x) = 0 on an interval I then f is constant on this interval.)
3. Estimate the area under the function below using:
(a) the right rectangle rule (Riemann sums) with 2 subintervals, then 4 subintervals, then 8 subintervals.
(b) the trapezoidal rule with 4 subintervals.
[Figure: graph of y = f(x).]
4. What is wrong with the following argument?
$F(x) = -1/(x-1)$ is an anti-derivative of the function $f(x) = 1/(x-1)^2$. Thus by the fundamental theorem, the area under f on the interval [0, 3] is given by
$$\int_0^3 \frac{1}{(x-1)^2}\,dx = \left.\frac{-1}{(x-1)}\right|_0^3 = \frac{-1}{2} - \frac{-1}{-1} = -\frac{3}{2}.$$
Does this show a positive function can have a negative area?
∗ 5. Find the area enclosed by the graphs of the functions $f(x) = x^2$ and $g(x) = x$.
6. Derive a formula for the area of the ellipse
$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$
7. Find the area under the graph of
(a) $y = \sqrt{4 - x^2}$ above [0, 1]
(b) $y = \sec x \tan x$ above $[0, \frac{\pi}{3}]$.
8. Evaluate each of the following integrals by interpreting it in terms of areas:
(a) $\int_1^3 (1 + 2x)\,dx$ (b) $\int_{-3}^0 (1 + \sqrt{9 - x^2})\,dx$ (c) $\int_{-2}^2 |x|\,dx$ (d) $\int_{-2}^2 \sqrt{4 - x^2}\,dx$.
9. Determine $\int_0^\infty x e^{-x^2}\,dx$.
10. By finding a suitable substitution, evaluate the following indefinite integrals:
∗ (a) $\int \frac{e^x}{1+e^{2x}}\,dx$
(b) $\int \frac{dx}{x^2+4x+9}$
(c) $\int \cos x \sin^3 x\,dx$.
11. Use integration by parts to find:
∗ (a) $\int x\cos x\,dx$
(b) $\int \cos^2 x\,dx$
(c) $\int e^x \cos x\,dx$
(d) $\int \ln x\,dx$
(e) $\int x\sec^2 x\,dx$
12. Use partial fractions to find
∗ (a) $\int \frac{x}{(x+1)(x+2)}\,dx$
(b) $\int \frac{x+3}{x(x+2)}\,dx$
(c) $\int \frac{4x+8}{x^2+2x-3}\,dx$.
13. Evaluate the following integrals:
(a) $\int \frac{x}{x^2-1}\,dx$ (b) $\int \frac{1+e^x}{1-e^x}\,dx$.
14. Evaluate the following indefinite integrals
(a) $\int \frac{\sin(1/t^2)}{t^3}\,dt$
(b) $\int \sin(x)\cos(\cos(x))\,dx$
Check by explicitly differentiating your answer to confirm it is indeed an anti-derivative.
15. Find the volume of the ellipsoid obtained by rotating the ellipse $\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$ about the x-axis.
16. Let a > 0 be constant. Find the volume V of the solid of revolution obtained by rotating y = f(x) above the interval [0, a] about the x-axis, for the following choices of f(x):
(a) $y = x^2$
∗ (b) $y = \sqrt{\ln(x+1)}$.
17. Use Euler’s formula to express cos(2θ) and sin(2θ) in terms of cos(θ) and sin(θ).
19.7 Sequences
1. Use L’Hopital’s rule to find
$$\lim_{n\to\infty} \frac{\sin(1/n)}{\log(1 + 1/n)}.$$
2. Evaluate the limit of the sequence {an}∞n=1 for the following choices of an:
(a) an = ncn, 0 ≤ c < 1∗ (b) an = 1+2n
1+3n .
∗ 3. Prove that $\lim_{n\to\infty} n^{1/n} = 1$.
19.8 Series
∗ 1. It is known that a repeated decimal can always be expressed as a fraction. Consider, for example, the decimal 0.727272 . . . . Use the fact that
$$0.727272\ldots = 0.72 + 0.0072 + 0.000072 + \ldots$$
to express the decimal as a geometric series. Hence show that
$$0.727272\ldots = \frac{8}{11}.$$
2. Determine whether or not the following series converge:
(a) $\sum_{n=1}^\infty \frac{n^n}{n!}$
(b) $\frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \ldots$
3. (a) Prove that $\sum_{n=1}^\infty n x^n = \frac{x}{(1-x)^2}$, for |x| < 1.
(Hint: Consider $x\frac{d}{dx}\left(\frac{1}{1-x}\right)$)
(b) Hence, or otherwise, evaluate $\sum_{n=1}^\infty \frac{n}{2^n}$.
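4. Consider the p = 1 case of the p-series (called the harmonic series):
$$\sum_{n=1}^\infty \frac{1}{n} = 1 + \left(\frac{1}{2}\right) + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \left(\frac{1}{9} + \frac{1}{10} + \cdots + \frac{1}{16}\right) + \ldots$$
and group together its terms as shown. Show that the sum of each group of fractions is at least $\frac{1}{2}$. Hence deduce that the harmonic series diverges.
∗ 5. Evaluate the following series:
(a) $\sum_{n=1}^\infty \left(\frac{1}{\sqrt{n}} - \frac{1}{\sqrt{n+1}}\right)$
(b) $\sum_{n=0}^\infty (-1)^n x^n$.
6. (a) For p > 0, show that $f(x) = x^p$ is strictly increasing on [0, ∞).
(b) For 0 < p < 1, show that
$$\frac{1}{n^p} \ge \frac{1}{n}, \quad n = 1, 2, 3, \ldots$$
(c) Show that the p-series
$$\sum_{n=1}^\infty \frac{1}{n^p}$$
diverges for 0 ≤ p ≤ 1.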
7. Determine whether or not the following series converge.
(a)∞∑n=1
(−1)nn3n
∗ (b) $\sum_{n=0}^\infty \frac{2^{2n}}{(2n)!}$
∗ (c) $\sum_{n=1}^\infty \frac{\ln(n)}{n!}$.
8. (a) Write down from the examples in lectures or the text book two series of positive terms $\sum a_n$ and $\sum b_n$ satisfying the following conditions:
$$\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = 1 \text{ and } \sum_n a_n \text{ converges.}$$
$$\lim_{n\to\infty} \left|\frac{b_{n+1}}{b_n}\right| = 1 \text{ and } \sum_n b_n \text{ diverges.}$$
This shows the ratio test gives no information if the limit is 1. You can accept any proofs of convergence in lectures or the text book but prove both the limits above.
9. Find an example of a sequence $a_n$ with $a_n = f(n)/g(n)$ for differentiable functions f and g, where $\lim_{n\to\infty} a_n$ and $\lim_{n\to\infty} f'(n)/g'(n)$ both exist, and
$$\lim_{n\to\infty} a_n \ne \lim_{n\to\infty} \frac{f'(n)}{g'(n)}.$$
This shows before using L’Hopital’s rule we must remember to check that either $\lim_{n\to\infty} f(n) = 0 = \lim_{n\to\infty} g(n)$ or $\lim_{n\to\infty} f(n) = \pm\infty = \lim_{n\to\infty} g(n)$!
19.9 Power series and Taylor series
1. This question is about the MacLaurin series for
$$(1 + x)^{3/2}.$$
Find the first three terms of this series. If this series is written as $\sum_{n=0}^\infty a_n$, find the formula for $a_n$. Find a formula for $c_n$ so $a_{n+1} = c_n a_n$.
∗ 2. Obtain the MacLaurin series expansion for arctan x and its radius of convergence. (Hint: Use $\frac{d}{dx}(\arctan x) = \frac{1}{1+x^2}$).
Using the fact that $\tan(\pi/6) = 1/\sqrt{3}$, show that
$$\pi = 2\sqrt{3} \sum_{n=0}^\infty \frac{(-1)^n}{(2n+1)3^n}.$$
∗ 3. Obtain the MacLaurin series expansions and radius of convergence for the following functions:
$$\cosh x = \frac{1}{2}(e^x + e^{-x}), \quad \sinh x = \frac{1}{2}(e^x - e^{-x}).$$
∗ 4. Show that for |x| < 1,
(a) $$\sqrt{1+x} = 1 + \sum_{n=1}^\infty \frac{(-1)^{n-1}(2n-2)!}{2^{2n-1}\,n!\,(n-1)!}x^n.$$
(b) $$\frac{1}{\sqrt{1+x}} = \sum_{n=0}^\infty \frac{(-1)^n (2n)!}{2^{2n}(n!)^2}x^n.$$
5. The theory of relativity predicts that the mass m of an object when it is moving at a speed v is given by the formula
$$m = \frac{m_0}{\sqrt{1 - v^2/c^2}}$$
where c is the speed of light and $m_0$ is the mass of the object when it is at rest. Express m as a MacLaurin series in $\frac{v}{c}$ and determine for which values of v this series converges. (Hint: Use a result from the previous question.)
6. Show that the functions
(a) $f(x) = (1 + x^2)e^{-x^2}$
(b) $f(x) = x^2 - \ln(1 + x^2)$
both satisfy f′(x) = f″(x) = 0 at x = 0 (so the second derivative test fails here). Show, using Taylor series, that (a) has a relative maximum and (b) a relative minimum at x = 0.
∗ 7. Find the sum of each of the following convergent series by recognizing it as a Taylor series evaluated at a particular value of x.
(a) $1 + \frac{2}{1!} + \frac{4}{2!} + \frac{8}{3!} + \cdots + \frac{2^n}{n!} + \ldots$
(b) $1 - \frac{1}{3!} + \frac{1}{5!} - \frac{1}{7!} + \cdots + \frac{(-1)^n}{(2n+1)!} + \ldots$
(c) $1 + \frac{1}{4} + \left(\frac{1}{4}\right)^2 + \left(\frac{1}{4}\right)^3 + \cdots + \left(\frac{1}{4}\right)^n + \ldots$
(d) $1 - \frac{100}{2!} + \frac{10000}{4!} + \cdots + \frac{(-1)^n \cdot 10^{2n}}{(2n)!} + \ldots$
19.10 Vectors
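1. A canoe travels at a speed 3 km/h in a direction due west across a river flowing due north at 4 km/h. Find the actual speed and direction of the canoe.
2. Draw a clock with vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{12}$ pointing from the centre to each of 1, 2, . . . , 12. Describe each of the following vectors.
(a) $\mathbf{v}_1 + \cdots + \mathbf{v}_{12}$.
(b) $\mathbf{v}_1 + \mathbf{v}_2 + 3\mathbf{v}_3 + \mathbf{v}_4 + \mathbf{v}_5 + \mathbf{v}_6 + \mathbf{v}_7 + \mathbf{v}_8 + 3\mathbf{v}_9 + \mathbf{v}_{10} + \mathbf{v}_{11} + \mathbf{v}_{12}$.
(c) $\mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + \mathbf{v}_4 + \mathbf{v}_6 + \mathbf{v}_7 + \mathbf{v}_9 + \mathbf{v}_{10} + \mathbf{v}_{12}$.
3. Find the lengths of the sides of the triangle ABC and determine whether the triangle is a right triangle.
(a) A = (4, 2, 0), B = (6, 6, 8), C = (10, 8, 6)
(b) A = (5, 5, 1), B = (6, 3, 2), C = (1, 4, 4).
4. (a) Suppose u and v are orthogonal vectors. Prove that $\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 = \|\mathbf{u} + \mathbf{v}\|^2$. What geometric fact have you proved?
(b) Prove that $\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2(\|\mathbf{u}\|^2 + \|\mathbf{v}\|^2)$. What geometric fact have you proved?
(c) Consider a triangle with side lengths a, b, c. Bisect side c, and draw a line connecting the midpoint to the opposite vertex. Let the length of this line be m. Using vectors prove that $a^2 + b^2 = 2(m^2 + (c/2)^2)$. This is called Apollonius’ Theorem.
∗ 5. Find a non-zero vector in $\mathbb{R}^3$ orthogonal to $\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$ and $\begin{pmatrix} 4 \\ 5 \\ 6 \end{pmatrix}$.
6. Given the point P = (2, −1, 4), find a point Q where $\overrightarrow{PQ}$ has the same direction as $\mathbf{v} = 7\mathbf{i} + 6\mathbf{j} - 3\mathbf{k}$.
∗ 7. For what values of c is the angle between the vectors (1, 2, 1) and (1, 1, c) equal to 60°? For what values of c are they orthogonal?
8. Determine whether or not each of the following sets of vectors are linearly independent.
(a) $\begin{pmatrix} 1 \\ 2 \\ 0 \end{pmatrix}, \begin{pmatrix} -1 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} -1 \\ 2 \\ 1 \end{pmatrix}$.
(b) $\mathbf{i} + \mathbf{j}$, $\mathbf{i} + \mathbf{j} + \mathbf{k}$, $\mathbf{i} - \mathbf{j}$.
(c) $\begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \\ 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \\ 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 4 \\ 4 \\ 1 \\ 0 \end{pmatrix}$.
9. Prove that the vectors
$$\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix}, \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}$$
form a basis for $\mathbb{R}^3$. What are the components of the vector $\begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}$ in this basis?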
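10. Given
$$\mathbf{v}_1 = \begin{pmatrix} 1 \\ -5 \\ -3 \end{pmatrix}, \quad \mathbf{v}_2 = \begin{pmatrix} -2 \\ 1 \\ 6 \end{pmatrix}, \quad \mathbf{v}_3 = \begin{pmatrix} 2 \\ -9 \\ h \end{pmatrix},$$
for what values of h is $\mathbf{v}_3$ in span$\{\mathbf{v}_1, \mathbf{v}_2\}$, and for what values of h is $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ linearly independent? Justify your answer.
∗ 11. Which of the following subsets of $\mathbb{R}^3$ are vector spaces? Find a basis in each case.
(a) $\{(a, b, a + 2b) \mid a, b \in \mathbb{R}\}$.
(b) $\{(a, b, c) \mid a - b = 3c + 1\}$.
12. Suppose $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\} \subset \mathbb{R}^n$ and let V be the set of all linear combinations of the elements of S, so
$$V = \{\alpha_1\mathbf{v}_1 + \alpha_2\mathbf{v}_2 + \cdots + \alpha_m\mathbf{v}_m \mid \alpha_1, \alpha_2, \ldots, \alpha_m \in \mathbb{R}\}.$$
Prove that V is a vector space.
13. If {a, b, c} forms a basis for a 3-dimensional vector space, could {a + b, b + c, c + a} also be a basis? Give your reasons.
∗ 14. Let $\mathbf{u}_1 = (0, 0, 3)$, $\mathbf{u}_2 = (-2, -1, 0)$, $\mathbf{u}_3 = (1, 0, 0)$. Which of $\mathbf{u}_1$, $\mathbf{u}_2$, $\mathbf{u}_3$ are linear combinations of $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (4, 5, 6)$, $\mathbf{v}_3 = (7, 8, 9)$? Do $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ form a basis of $\mathbb{R}^3$?
15. Determine if the following vectors are a basis for $\mathbb{R}^3$. Justify your answer.
$$\begin{pmatrix} 0 \\ 1 \\ -2 \end{pmatrix}, \begin{pmatrix} 5 \\ -7 \\ 4 \end{pmatrix}, \begin{pmatrix} 6 \\ 3 \\ 5 \end{pmatrix}$$
16. Let $\mathbf{v}_1 = (1, 2, 3)$, $\mathbf{v}_2 = (4, 5, 6)$, $\mathbf{v}_3 = (-2, -1, 0)$. Do $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ form a basis of $\mathbb{R}^3$? Explain.
∗ 17. Let $\mathbf{v}_1 = (1, 4, 6)$, $\mathbf{v}_2 = (2, 0, 1)$, $\mathbf{v}_3 = (-1, 12, 16)$. Let V = span$\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$. Find dim V.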
19.11 Matrix Algebra
1. Let $A = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}$. Find simple expressions for $A^2$ and $A^3$. What do you notice?
2. Find 2 × 2 matrices A, B such that $(A + B)^2 \ne A^2 + 2AB + B^2$.
∗ 3. Prove that $(A + B)^T = A^T + B^T$ for any m × n matrices A and B.
4. Let $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Find all 2 × 2 matrices B such that AB = BA.
5. Find a 2 × 2 matrix A with no entry of A equal to 0 such that $A^2 = 0$.
6. A matrix A is called idempotent if $A^2 = A$.
(a) Find x so that A is idempotent:
$$A = \begin{pmatrix} 2/3 & 0 & x \\ 0 & 1 & 0 \\ \frac{\sqrt{2}}{3} & 0 & 1/3 \end{pmatrix}$$
(b) What is the smallest (integer) index N so that $B^N = 0$? A matrix with such a property is called nilpotent.
$$B = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$
7. Let $C = \begin{pmatrix} 1 & 2 & 1 \\ 1 & 1 & 2 \\ 1 & 3 & -1 \end{pmatrix}$. Show that C is invertible, i.e. show linear independence of a certain set of vectors.
∗ 8. Let $A = \begin{pmatrix} 4 & -1 & 2 \\ 3 & 0 & x \\ 8 & -3 & 2 \end{pmatrix}$. For which value(s) of x is A not invertible?
9. Find the inverse matrix for:
(a) $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$
(b) $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$
(c) $\begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}$
∗ 10. Let
$$A = \begin{pmatrix} \frac{1}{5} & \frac{2}{5} & \frac{3}{5} \\ \frac{3}{5} & \frac{1}{5} & \frac{2}{5} \\ 0 & 0 & \frac{1}{5} \end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix} -1 & 2 & x \\ 3 & -1 & y \\ 0 & 0 & z \end{pmatrix}.$$
Find real numbers x, y and z such that $B = A^{-1}$.
11. A swimming pool can be filled by 3 pipes A, B and C. Pipe A can fill the pool by itself in 8 hours. Pipes A and C used together can fill the pool in 6 hours and pipes B and C used together can fill the pool in 10 hours. How long does it take to fill the pool if all 3 pipes are used?
∗ 12. Use Gaussian elimination to solve the following system of equations, and check your answer by substitution.
$$x_1 - 2x_2 + 3x_3 = -6$$
$$4x_1 + 2x_3 = 2$$
$$3x_1 + x_2 + 2x_3 = 3$$
13. Let $\mathbf{x}_p$ be a particular solution of Ax = b. Show that every solution of this equation is of the form $\mathbf{x} = \mathbf{x}_p + \mathbf{h}$ where h is a solution of the homogeneous equation Ax = 0.
14. Show that the set of solutions of Ax = b, with b = 0, form a vector space for A. If b ≠ 0, does the set of solutions still form a vector space?
15. Show that $A = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix}$ satisfies $A^2 - 4A + 3I = 0$. Hence deduce that A is invertible with inverse $A^{-1} = \frac{1}{3}(4I - A)$.
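16. Let α ∈ R. Use Gauss-Jordan elimination to find the inverses of the following matrices. Check by multiplication if your inverse is correct.
∗ (a) $\begin{pmatrix} 1 & 1 & 0 \\ 1 & 2 & \alpha \\ 2 & 1 & 1-\alpha \end{pmatrix}$
(b) $\begin{pmatrix} 0 & 0 & -1 \\ 1 & 2 & 1 \\ 1 & 0 & \alpha \end{pmatrix}$.
17. Find the inverses of the following matrices.
(a) $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 1 \\ 2 & 1 & 1 \end{pmatrix}$.
(b) $\begin{pmatrix} 1 & 2 & 3 & 1 \\ 1 & 3 & 3 & 2 \\ 2 & 4 & 3 & 3 \\ 1 & 1 & 1 & 1 \end{pmatrix}$.
(c) $\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}$.
18. Let $A = \begin{pmatrix} 1 & 0 & 5 \\ 2 & -1 & 2 \\ 3 & 1 & 2 \end{pmatrix}$. Calculate the determinant of A.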
19. Determine for which values of x the following determinants vanish:
(a)
∣∣∣∣ 1− x 23 2− x
∣∣∣∣(b)
∣∣∣∣ 1 2x2 x
∣∣∣∣∗ (c)
∣∣∣∣∣∣1 1 42 x 1
2x x2 x
∣∣∣∣∣∣.20. Evaluate the following determinants
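(a) $\begin{vmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{vmatrix}$
(b) $\begin{vmatrix} 3 & 1 & 0 \\ 0 & 6 & 4 \\ 0 & 0 & 1 \end{vmatrix}$
(c) $\begin{vmatrix} -1 & 1 & 3 & 4 \\ 3 & 2 & 4 & -1 \\ 0 & 3 & 1 & 0 \\ 4 & 1 & -1 & -5 \end{vmatrix}$
(d) $\begin{vmatrix} -1 & 1 & 2 \\ 2 & 1 & -1 \\ -1 & 1 & -1 \end{vmatrix}$
∗ 21. Use determinants to establish the linear dependence or independence of the following row vectors: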
(a) (4, 1, 4), (2, 0, 4), (−1, 2, 2)
(b) (3,−3,−1), (6,−6, 0), (3,−3, 2)
(c) (1, 1, 1), (0, 1, 0), (0, 1,−1)
∗ 22. For an invertible matrix A, prove that An is also invertible, for all natural numbers n.
20 Appendix - Mathematical notation
>        is strictly greater than                              e.g. 5 > 2
<        is strictly less than                                 e.g. 2 < 5
≥, ≤     is greater/less than or equal to                      e.g. 2 ≤ 2, 4 ≥ 3
∈        is an element of                                      e.g. 3.5 ∈ R
∉        is not an element of                                  e.g. 3.5 ∉ N
⊆        is a subset of                                        e.g. N ⊂ R
{x|...}  set of all x for which a condition is fulfilled       e.g. {x ∈ R | x > 0}
∀        for all                                               e.g. ∀x ∈ R
...      continue the pattern                                  e.g. 1, 2, 3, ..., 9, 10
n!       n factorial, n! = n · (n − 1) · ... · 3 · 2 · 1       e.g. 3! = 3 · 2 · 1 = 6
⇒        implies, if ... then                                  e.g. x = 2 ⇒ x² = 4
⇔        is equivalent to, if and only if                      e.g. x² = 4 ⇔ x = 2 or x = −2
21 Appendix - Basics
This appendix contains some very important basic mathematical rules. Since this is high school material, you must know and be able to apply all of them to successfully complete MATH1051. Go through the sections and check your knowledge - do some additional exercises from textbooks if you encounter problems, or check the MATH1040/MATH1050 material on the course websites.
21.1 Powers
21.1.1 Notation (product of two real numbers)
The product of two real numbers a and b can be written in either of the following ways:
a× b = a · b = ab.
21.1.2 Notation (powers)
The product after multiplying the same real number a n times is
$$\underbrace{a a \cdots a}_{n \text{ factors}} = a^n.$$
Note that $a^n = a\,a^{n-1} = a^{n-1}a$.
Then $a^0 = 1$ (for a ≠ 0) and $a^1 = a$, $aa = a^2 = a\,a^1$, $aaa = a^3 = a\,a^2$.
We also have $0^n = 0$ for n ≠ 0.
21.1.3 Power rules
Let a, b, c ∈ R. We set
$$a^{-b} = \frac{1}{a^b}.$$
Then
$$a^b a^c = a^{b+c}, \quad a > 0$$
and
$$\left(a^b\right)^c = a^{bc}.$$
We also have
$$a^{p/q} = \sqrt[q]{a^p},$$
with p, q ∈ Z and q > 0.
These formulae also apply for some other values of a, p and q but care needs to be taken.
For example, a = −1, p = 1. If q = 3, then $(-1)^{1/3} = -1$.
But if q = 2, then $(-1)^{1/2} = i$, which is not a real but a complex number!
21.1.4 Examples
1. $(a \times a) \times (a \times a) \times (a \times a) = (a^2)^3 = a^{2\times 3} = a^6$
2. $a^2 \times a^3 = a^{2+3} = a^5$
3. $a^{1/2} = \sqrt{a}$
4. $a^{1/3} = \sqrt[3]{a}$
21.2 Multiplication/Addition of real numbers
21.2.1 Laws (multiplication and addition)
1. a+ b = b+ a and ab = ba (Commutative Law)
2. (a+ b) + c = a+ (b+ c) and (ab)c = a(bc) (Associative Law)
3. a(c+ d) = ac+ ad = (c+ d)a = ca+ da (Distributive Law)
21.2.2 Examples (multiplications of sums)
1. $(a + b)(c + d) = ac + ad + bc + bd$,
as $(a + b)(c + d) = a(c + d) + b(c + d) = ac + ad + bc + bd$.
2. $(a + b)^2 = a^2 + 2ab + b^2$
as $(a + b)^2 = (a + b)(a + b) = a(a + b) + b(a + b) = aa + ab + ba + bb = a^2 + 2ab + b^2$.
3. $(a + b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$
as $(a + b)^3 = (a + b)(a + b)^2 = (a + b)(a^2 + 2ab + b^2) = a(a^2 + 2ab + b^2) + b(a^2 + 2ab + b^2) = a^3 + 3a^2b + 3ab^2 + b^3$.
4. $(a - b)^2 = a^2 - 2ab + b^2$
5. $(a + b)(a - b) = a^2 - b^2$
21.3 Fractions
21.3.1 Rule (multiplying and dividing two fractions)
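Let a, b, c, d ∈ Z. Then
$$\frac{a}{b} \cdot \frac{c}{d} = \frac{ac}{bd}$$
and
$$\frac{\;\frac{a}{b}\;}{\frac{c}{d}} = \frac{a}{b} \cdot \frac{d}{c} = \frac{ad}{bc}.$$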
21.3.2 Rule (adding two fractions)
Let a, b, c, d ∈ Z. Then
$$\frac{a}{c} + \frac{b}{c} = \frac{a + b}{c}.$$
If the two fractions have different denominators, a common denominator needs to be found:
$$\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}.$$
21.3.3 Examples
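1. $\frac{x^2 + 1}{x} = \frac{x^2}{x} + \frac{1}{x} = x + \frac{1}{x}$.
2. $\frac{x}{x + 1} = \frac{(x + 1) - 1}{x + 1} = \frac{x + 1}{x + 1} - \frac{1}{x + 1} = 1 - \frac{1}{x + 1}$
3. Simplify $\frac{1}{x}\left(\frac{1}{x + 1} - \frac{2}{x + 2}\right)$
Solution: First
$$\frac{1}{x + 1} - \frac{2}{x + 2} = \frac{1(x + 2) - 2(x + 1)}{(x + 1)(x + 2)} = \frac{-x}{(x + 1)(x + 2)}$$
$$\Rightarrow \frac{1}{x}\left(\frac{1}{x + 1} - \frac{2}{x + 2}\right) = \frac{1}{x}\left(\frac{-x}{(x + 1)(x + 2)}\right) = \frac{-1}{(x + 1)(x + 2)}$$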
21.4 Solving quadratic equations
A quadratic equation is an equation of the form $ax^2 + bx + c = 0$, or $x^2 + bx + c = 0$ if all coefficients are divided by a. Here a, b and c are given. Quite often you will be required to solve equations of this type, which means we try to determine x. There are many different ways of finding the solution. A few are listed here. They are completing the square, using a formula, or factorizing.
21.4.1 Completing the square
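We may solve $x^2 + bx + c = 0$ by first writing the quadratic polynomial $x^2 + bx + c$ in the form $(x + d)^2 + e$, and then solving $(x + d)^2 + e = 0$.
Applying the formula from example 21.2.2.(2),
$$x^2 + bx = \left(x + \frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2 \quad\text{(check!)}$$
$$\Rightarrow x^2 + bx + c = \left(x + \frac{b}{2}\right)^2 - \left(\frac{b}{2}\right)^2 + c = (x + d)^2 + e, \quad\text{with } d = \frac{b}{2} \text{ and } e = c - \left(\frac{b}{2}\right)^2 = c - d^2.$$
To solve: $(x + d)^2 + e = 0 \Leftrightarrow (x + d)^2 = -e \Leftrightarrow x + d = \pm\sqrt{-e} \Leftrightarrow x = -d - \sqrt{-e}$ or $x = -d + \sqrt{-e}$.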
21.4.2 Examples
1. Express $x^2 + 6x + 5 = (x + d)^2 + e$. Then solve the equation $x^2 + 6x + 5 = 0$.
$x^2 + 6x + 5 = (x + 3)^2 - 3^2 + 5 = (x + 3)^2 - 4$.
We solve $x^2 + 6x + 5 = (x + 3)^2 - 4 = 0 \Leftrightarrow (x + 3)^2 = 4 \Leftrightarrow x + 3 = \pm 2 \Leftrightarrow x = -5$ or $x = -1$.
2. The method of completing the square can also be used to find the turning point of a parabola. Express $f(x) = x^2 + 4x + 7 = (x + d)^2 + e$. Find the turning point of this parabola.
Now b = 4 so d = 2.
Also c = 7 so e = 7 − 4 = 3. Thus
$x^2 + 4x + 7 = (x + 2)^2 + 3$.
The turning point of a parabola $f(x) = (x + d)^2 + e$ is at (−d, e).
In this case the turning point is a minimum located at (−2, 3).
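The steps of completing the square can be followed mechanically. The sketch below is an illustration with made-up function names; it uses `cmath.sqrt` so that the case e > 0 (no real solutions) returns complex roots instead of raising an error.

```python
import cmath

def complete_square(b, c):
    """Return (d, e) with x^2 + b*x + c = (x + d)^2 + e."""
    d = b / 2
    e = c - d * d
    return d, e

def solve_monic(b, c):
    """Solve x^2 + b*x + c = 0 via x = -d - sqrt(-e) or x = -d + sqrt(-e)."""
    d, e = complete_square(b, c)
    root = cmath.sqrt(-e)
    return (-d - root, -d + root)

print(complete_square(6, 5))  # (3.0, -4.0), so x^2 + 6x + 5 = (x + 3)^2 - 4
print(complete_square(4, 7))  # (2.0, 3.0), turning point at (-2, 3)
```

For x² + 6x + 5 the two values returned by `solve_monic(6, 5)` are −5 and −1 (as complex numbers with zero imaginary part), in agreement with Example 1.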
21.4.3 Formula (Solution to quadratic equation)
There is also a formula you can use to solve a quadratic equation.
Let $a, b, c \in \mathbb{R}$ with $a \neq 0$.
The solution of the equation
$$ax^2 + bx + c = 0$$
is
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$
The solution of the equation
$$x^2 + bx + c = 0$$
is
$$x = -\frac{b}{2} \pm \sqrt{\frac{b^2}{4} - c}.$$
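As an illustration, the formula translates directly into a short Python function. This is a minimal sketch assuming real coefficients and real solutions only; the name `solve_quadratic` is hypothetical, and the check on the sign of $b^2 - 4ac$ is an added assumption, not something stated in the formula above.

```python
import math

def solve_quadratic(a, b, c):
    """Return the real solutions of a*x^2 + b*x + c = 0, smallest first."""
    if a == 0:
        raise ValueError("a must be non-zero for a quadratic equation")
    disc = b * b - 4 * a * c  # the expression under the square root
    if disc < 0:
        return []  # no real solutions
    s = math.sqrt(disc)
    # A set removes the duplicate when disc == 0 (repeated solution).
    return sorted({(-b - s) / (2 * a), (-b + s) / (2 * a)})

print(solve_quadratic(1, 6, 5))   # x = -5 or x = -1
print(solve_quadratic(1, -7, 6))  # x = 1 or x = 6
```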
21.4.4 Factorizing
Another way of solving quadratic (and higher order polynomial) equations is by factorizing. We try to write the polynomial as a product of sums. You can sometimes use the formulas from Examples 21.2.2.(2), (4) or (5) for this. Factorizing is also used to simplify expressions.
If you cannot use one of the formulas from Examples 21.2.2.(2), (4) or (5), you can sometimes factorize in the following way. We would like to express $x^2 + bx + c = (x + p)(x + q)$. Multiplying out we have $x^2 + bx + c = (x + p)(x + q) = x^2 + (p + q)x + pq$. We now need to find $p$ and $q$ such that $b = p + q$ and $c = pq$. Have a look at these examples.
21.4.5 Examples
1. Factorize $f(x) = x^2 + 6x + 5$ using the formula from Example 21.2.2.(2). Then find the zeros.
$x^2 + 6x + 5 = (x + 3)^2 - 2^2 = ([x + 3] - 2)([x + 3] + 2) = (x + 1)(x + 5)$.
The zeros can be read directly from this expression:
$(x + 1)(x + 5) = 0 \Rightarrow x = -1$ or $x = -5$.
2. Factorize $x^2 + 3x + 2$ without using the formula from Example 21.2.2.(2).
$x^2 + 3x + 2 = x^2 + bx + c$. We must find $p$ and $q$ such that $b = 3 = p + q$ and $c = 2 = pq$.
A solution is $p = 1$ and $q = 2$, so $x^2 + 3x + 2 = (x + 1)(x + 2)$.
3. Factorize $x^2 - 7x + 6$.
Now $-7 = p + q$ and $6 = pq$. A solution is $p = -6$ and $q = -1$, so $x^2 - 7x + 6 = (x - 6)(x - 1)$.
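The search for $p$ and $q$ in Examples 2 and 3 can be sketched as a brute-force loop over the divisors of $c$. This assumes integer $b$ and $c$ and looks only for integer $p$ and $q$; the name `factor_monic` is hypothetical, and a result of `None` only means that no integer pair exists, not that the polynomial has no zeros.

```python
def factor_monic(b, c):
    """Return integers (p, q), p <= q, with p + q == b and p * q == c, or None."""
    if c == 0:
        # x^2 + b*x = x * (x + b), so p = 0 and q = b.
        return (min(0, b), max(0, b))
    for p in range(-abs(c), abs(c) + 1):
        if p != 0 and c % p == 0:  # p must divide c
            q = c // p
            if p + q == b:
                return (min(p, q), max(p, q))
    return None

print(factor_monic(3, 2))   # p = 1, q = 2:   (x + 1)(x + 2)
print(factor_monic(-7, 6))  # p = -6, q = -1: (x - 6)(x - 1)
```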
21.4.6 Factorizing using common factors
The following formula uses common factors to factorize:
$$a(x + d) + b(x + d) = (a + b)(x + d).$$
21.4.7 Examples
1. Factorize $x(x + 3) + 2(x + 3)$.
$x(x + 3) + 2(x + 3) = (x + 2)(x + 3)$,
setting $a = x$, $b = 2$, and $d = 3$ in the above formula.
2. Factorize $(x + 4)(x + 3) + 2(x + 3)$.
$(x + 4)(x + 3) + 2(x + 3) = ([x + 4] + 2)(x + 3)$,
setting $a = [x + 4]$, $b = 2$, and $d = 3$.
21.5 Surds
The following formula is often used to “simplify” surds:
$$\sqrt{a} - \sqrt{b} = \frac{a - b}{\sqrt{a} + \sqrt{b}}.$$
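To see this:
$$\sqrt{a} - \sqrt{b} = \left(\sqrt{a} - \sqrt{b}\right)\frac{\sqrt{a} + \sqrt{b}}{\sqrt{a} + \sqrt{b}} = \frac{\left(\sqrt{a} - \sqrt{b}\right)\left(\sqrt{a} + \sqrt{b}\right)}{\sqrt{a} + \sqrt{b}} = \frac{(\sqrt{a})^2 - (\sqrt{b})^2}{\sqrt{a} + \sqrt{b}} = \frac{a - b}{\sqrt{a} + \sqrt{b}}.$$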
Similarly, provided $a \neq b$,
$$\sqrt{a} + \sqrt{b} = \frac{a - b}{\sqrt{a} - \sqrt{b}}.$$
21.5.1 Example
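Simplify $\dfrac{\sqrt{2x + 1} - \sqrt{x + 1}}{x}$, where $x \neq 0$.
First
$$\sqrt{2x + 1} - \sqrt{x + 1} = \frac{\left(\sqrt{2x + 1}\right)^2 - \left(\sqrt{x + 1}\right)^2}{\sqrt{2x + 1} + \sqrt{x + 1}} = \frac{x}{\sqrt{2x + 1} + \sqrt{x + 1}}$$
$$\Rightarrow \frac{\sqrt{2x + 1} - \sqrt{x + 1}}{x} = \frac{1}{x}\left(\frac{x}{\sqrt{2x + 1} + \sqrt{x + 1}}\right) = \frac{1}{\sqrt{2x + 1} + \sqrt{x + 1}}, \quad \text{where } x \neq 0.$$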
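A quick numerical comparison can help confirm a simplification like this one. The sketch below uses the hypothetical names `original` and `simplified` and a handful of arbitrarily chosen sample points with $x \neq 0$ and $2x + 1 \geq 0$. Agreement up to floating-point rounding is a sanity check only, not a proof.

```python
import math

def original(x):
    """(sqrt(2x + 1) - sqrt(x + 1)) / x, defined for x != 0 and x >= -1/2."""
    return (math.sqrt(2 * x + 1) - math.sqrt(x + 1)) / x

def simplified(x):
    """1 / (sqrt(2x + 1) + sqrt(x + 1)), the simplified form."""
    return 1 / (math.sqrt(2 * x + 1) + math.sqrt(x + 1))

# Compare the two forms at a few sample points, allowing for rounding error.
for x in (-0.25, 0.5, 1.0, 3.0, 10.0):
    assert math.isclose(original(x), simplified(x), rel_tol=1e-9)
```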