
October 13, 2010

LINEAR TRANSFORMATIONS

RODICA D. COSTIN

Contents

2. Linear Transformations
2.1. Definition and examples
2.2. The matrix of a linear transformation
2.3. Null space and range
2.4. Dimensionality properties
2.5. Exercises
2.6. Invertible transformations, isomorphisms
2.7. Change of basis, similar matrices
2.8. Examples
2.9. Summary of some properties of n dimensional vector spaces
2.10. Direct sum of subspaces
2.11. Left inverse, right inverse of functions and of matrices

2. Linear Transformations

2.1. Definition and examples. Let U, V be two vector spaces over the same field F (which for us is R or C).

Definition 1. A linear transformation T : U → V is a function which is linear, in the sense that it satisfies

(1) T (u + v) = T (u) + T (v) for all u,v ∈ U

and

(2) T (cu) = cT (u) for all u ∈ U, c ∈ F

The properties (1), (2), which express linearity, are often written in one formula as

T (cu + dv) = cT (u) + dT (v) for all u,v ∈ U, c, d ∈ F

Note that if T is linear, then also

T( ∑_{k=1}^{r} ck uk ) = ∑_{k=1}^{r} ck T(uk)   for all u1, . . . , ur ∈ U, c1, . . . , cr ∈ F




Note that a linear transformation takes the zero vector to the zero vector. (Indeed, take c = 0 in (2).)

Linear transformations are also called linear maps, or linear operators.

Examples.
1. The scaling transformation T : Rn → Rn given by T(v) = 2v is linear. The general scaling transformation is linear: for some fixed scalar λ ∈ R, let T(v) = λv. In particular, T(v) = 0 for all v is linear. Even more generally, dilations T : Rn → Rn, T(x1, x2, . . . , xn) = (λ1x1, λ2x2, . . . , λnxn), are linear. The rotation of the plane by θ (rad) counterclockwise, R : C → C, R(z) = e^{iθ}z, is linear (over F = C).
2. The projection operator P : R3 → R2 defined by P(x1, x2, x3) = (x1, x2) is linear.
3. The translation transformation T : Rn → Rn given by T(v) = v + e (where e ∈ Rn, e ≠ 0) is not linear.
4. A few examples in infinite dimensions. Denote by P the vector space of polynomials with coefficients in F.
o) Evaluation at some t0: the operator E_{t0} : P → F defined by E_{t0}(p) = p(t0) is linear.
a) The differentiation operator D : P → P, defined by (Dp)(t) = p′(t), is linear.
b) The integration operator J : P → P, defined by (Jp)(t) = ∫_0^t p(s) ds, is linear, and
c) so is I : P → F, defined by Ip = ∫_0^1 p(s) ds. Note that I is the composition I = E1 ◦ J.
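These operators are easy to experiment with numerically. Below is a small sketch (not part of the original notes) using numpy.polynomial to check the composition I = E1 ◦ J and the linearity of I on two concrete polynomials:

    import numpy as np
    from numpy.polynomial import Polynomial

    p = Polynomial([3, 0, 2])              # p(t) = 3 + 2t^2
    q = Polynomial([0, 1])                 # q(t) = t

    # J: antiderivative vanishing at 0, so (Jp)(t) = integral_0^t p(s) ds
    Jp = p.integ(lbnd=0)
    E1 = lambda r: r(1.0)                  # evaluation at t0 = 1
    print(E1(Jp))                          # I(p) = 3 + 2/3 = 3.666...

    I = lambda r: E1(r.integ(lbnd=0))      # I = E1 o J
    print(np.isclose(I(2*p + 3*q), 2*I(p) + 3*I(q)))   # True: I is linear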

Remarks. The sum of two linear transformations S, T : U → V, defined as for any functions by (S + T)(u) = S(u) + T(u), is also a linear transformation, and so is multiplication by scalars (defined as (cT)(u) = cT(u)). In fact, the set of all linear transformations from U to V, denoted L(U, V), is a vector space.

Note that the composition of two linear transformations is also a linear transformation.

2.2. The matrix of a linear transformation.
A linear transformation is completely determined by its action on a basis.
Indeed, let T : U → V be a linear transformation between two finite dimensional vector spaces U and V. Let BU = {u1, . . . , un} be a basis of U. Then any x ∈ U can be uniquely written as

(3) x = c1u1 + . . . + cnun for some c1, . . . , cn ∈ F

and by the linearity of T we have

(4) Tx = T(c1u1 + . . . + cnun) = c1T(u1) + . . . + cnT(un)

hence, once we give the vectors T(u1), . . . , T(un), the action of T is determined on all the vectors in U.


To determine T(u1), . . . , T(un) we need to specify the representation of these vectors in a given basis of V. Let then BV = {v1, . . . , vm} be a basis of V, and denote by Aij the coefficients of these representations:

T(uj) = ∑_{i=1}^{m} Aij vi, for some Aij ∈ F, for all j = 1, . . . , n

The matrix [Aij]_{i=1,...,m; j=1,...,n} is called the matrix representation of the linear transformation T in the bases BU and BV.
Note that the matrix representation of a linear transformation depends not only on the bases chosen for U and V, but also on the order in which the vectors in each basis are enumerated. It would have been more precise to write the two bases BU, BV as ordered tuples rather than sets.

Example 1. Let T : C2 → C (note that here the scalars are F = C, and n = 2, m = 1). Consider the standard basis {e1, e2} for C2, where

(5) e1 = [1, 0]^T, e2 = [0, 1]^T

and the basis {f1} for C, where f1 = [1].
For any z ∈ C2 we have

z ≡ [z1, z2]^T = z1 e1 + z2 e2

Define T on the basis of its domain by T(e1) = f1, T(e2) = −f1. Then, since T is linear, its action on any element of C2 is determined:

T(z) = T(z1e1 + z2e2) = z1T(e1) + z2T(e2) = z1[1] + z2[−1] = (z1 − z2)[1]

The same calculation in matrix notation: [Aij]_{i=1; j=1,2} = [1, −1] is the matrix of T in the two bases, and T(z) equals this matrix A multiplied by the coefficients of z in this basis (written as a column vector).

Example 2. Consider the standard basis (5) for R2 and the standard basis for R3:

(6) f1 = [1, 0, 0]^T, f2 = [0, 1, 0]^T, f3 = [0, 0, 1]^T

Define a linear transformation T : R2 → R3 by T(ej) = A1jf1 + A2jf2 + A3jf3 for j = 1, 2 (where the Aij are given real numbers). Note that [Aij]_{i=1,2,3; j=1,2} is the matrix representation of the linear transformation T in the two standard bases.
Then for any x ∈ R2 the value of T(x) is determined, since

x = [x1, x2]^T = x1 e1 + x2 e2


therefore, by the linearity of T:

T(x) = T(x1e1 + x2e2) = x1T(e1) + x2T(e2)
= x1(A11f1 + A21f2 + A31f3) + x2(A12f1 + A22f2 + A32f3)
= (x1A11 + x2A12)f1 + (x1A21 + x2A22)f2 + (x1A31 + x2A32)f3 = Ax

and therefore

T(x) = Ax

where A is the matrix representation of T and Ax is the matrix multiplication of the matrix A = [Aij] with the column vector x.

In general, let U = Rn with the standard basis {e1, . . . , en} where

(7) e1 = [1, 0, . . . , 0]^T, e2 = [0, 1, . . . , 0]^T, . . . , en = [0, 0, . . . , 1]^T

and let V = Rm with the standard basis {f1, . . . , fm} (the vectors fj are defined as in (7), only that the ei have n components, while the fj have m components). Note that the matrix of a linear transformation T in the standard bases has as columns the vectors T(ej):

A = [T(e1), T(e2), . . . , T(en)]

and that

T(x) = Ax

where A is the matrix representation of T.

Conversely, let A be an m × n matrix. Then multiplication of vectors in Rn by the matrix A,

T : Rn → Rm, T(x) = Ax,

is a linear transformation, whose matrix in the standard bases is exactly A.
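A short numerical sketch of this correspondence (the map T below is an arbitrary linear map chosen for illustration): the matrix is recovered by stacking T(e1), . . . , T(en) as columns.

    import numpy as np

    def T(x):                       # an illustrative linear map R^3 -> R^2
        x1, x2, x3 = x
        return np.array([x1 + 2*x2, 3*x3 - x1])

    n = 3
    # columns of A are T(e_1), ..., T(e_n); rows of np.eye(n) are e_1, ..., e_n
    A = np.column_stack([T(e) for e in np.eye(n)])

    x = np.array([1.0, -2.0, 5.0])
    print(np.allclose(T(x), A @ x))   # True: T(x) = Ax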

Theorem 2. Let U, V, W be vector spaces over F, and fix bases BU, BV, BW. Let T, S : U → V be linear transformations, with matrix representations AT, AS in the bases BU, BV.
(i) A linear combination cS + dT : U → V is a linear transformation for any scalars c, d ∈ F, and its matrix representation in the bases BU, BV is cAS + dAT.
(ii) Let R : V → W be linear, with matrix AR in the bases BV, BW. Then the composition R ◦ S : U → W, defined (as usual) by (R ◦ S)(x) = R(S(x)), is a linear transformation, with matrix ARAS in the bases BU, BW.


The proof of the Theorem is immediate, and it is left as an exercise. The last fact, that composition of linear operators corresponds to multiplication of the corresponding matrices, is a straightforward calculation (this is why we multiply matrices using that strange rule...). 2
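A quick numerical spot-check of part (ii), with random matrices standing in for AS and AR (a sketch, assuming the standard bases):

    import numpy as np

    rng = np.random.default_rng(0)
    AS = rng.standard_normal((4, 3))   # S : R^3 -> R^4
    AR = rng.standard_normal((2, 4))   # R : R^4 -> R^2

    x = rng.standard_normal(3)
    # (R o S)(x) computed two ways: composing the maps vs. multiplying matrices
    print(np.allclose(AR @ (AS @ x), (AR @ AS) @ x))   # True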

2.3. Null space and range. The following definitions and properties arevalid in finite or infinite dimensions.

Definition 3. Let T : U → V be a linear transformation.
The null space (or kernel) of T is

N(T) = {u ∈ U ; T(u) = 0}

The range of T is

R(T) = {v ∈ V | v = T(u) for some u ∈ U}

Theorem 4. Let T : U → V be a linear transformation. Then:
(i) T(0) = 0.
(ii) N(T) is a subspace of U.
(iii) R(T) is a subspace of V.
(iv) T is one to one if and only if N(T) = {0}.

The proof of (i)-(iii) is immediate. To show (iv), let x, y ∈ U with T(x) = T(y). This is equivalent to T(x) − T(y) = 0, which by linearity is equivalent to T(x − y) = 0, which means that x − y ∈ N(T). With this observation, (iv) follows. 2

Important names: dimR(T ) is called the rank of T (denoted rank(T )),and dimN (T ) is called the nullity of T (denoted nullity(T )).

Remarks: Let A be an m × n matrix, and consider the linear transformation T : Rn → Rm defined by T(x) = Ax. Note the following relations between systems of linear equations and the associated linear transformation.
(0) The homogeneous system Ax = 0 has the set of solutions N(T).
(1) The system Ax = b is solvable (it has solutions) if and only if b ∈ R(T).
(2) Suppose the system Ax = b is solvable. Then its solution is unique if and only if N(T) = {0}.
(3) If x0 is a particular solution of Ax = b, then the set of all solutions is

x0 + N(T) = {x0 + z | z ∈ N(T)}

In other words:

General solution = Particular solution + Homogeneous solutions

Indeed, on one hand it is clear that all the vectors in x0 + N(T) solve the same equation: A(x0 + z) = Ax0 + Az = Ax0 = b. Conversely, any two solutions x1, x2 of the same equation, Ax1 = b, Ax2 = b, differ by an element x1 − x2 ∈ N(T).


(4) A solvable system Ax = b has a unique solution if and only if the homogeneous equation Ax = 0 has a unique solution (which is 0).
This is clear by (3) and property (iv) of Theorem 4.
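These remarks can be illustrated numerically; the sketch below (with an illustrative rank-deficient matrix and a solvable right-hand side) uses scipy.linalg.null_space for N(T) and a least-squares solve for a particular solution:

    import numpy as np
    from scipy.linalg import null_space, lstsq

    A = np.array([[1., 2., 3.],
                  [2., 4., 6.]])          # rank 1, so N(T) is 2-dimensional
    b = np.array([6., 12.])               # solvable: b = A(1,1,1) is in R(T)

    x0, *_ = lstsq(A, b)                  # a particular solution
    Z = null_space(A)                     # columns span N(T)

    # every x0 + Z c solves Ax = b
    c = np.array([1.5, -2.0])
    x = x0 + Z @ c
    print(np.allclose(A @ x, b))          # True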

2.4. Dimensionality properties.

2.4.1. Main Theorem.

Theorem 5. Let T : U → V be a linear transformation between two finite dimensional vector spaces U and V. Let BU = {u1, . . . , un} be a basis of U. Then:
(i) R(T) = Sp(T(u1), . . . , T(un)).
(ii) T is one to one if and only if T(u1), . . . , T(un) are linearly independent.
(iii) dimR(T) + dimN(T) = dimU
(in other words, rank(T) + nullity(T) = n).

2.4.2. Remarks. 1) Property (iii) can be interpreted intuitively as: when the operator T acts on n dimensions, some remain independent (their number is dimR(T)), while the others are collapsed to zero (their number is dimN(T)).
2) If the linear transformation T is multiplication by a matrix A, T : Rn → Rm given by T(x) = Ax:
◦ property (i) states that the range of T is the column space of A;
◦ property (ii) states that a solvable system Ax = b has a unique solution if and only if the columns of A are independent.
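Property (iii) and these remarks can be checked on any concrete matrix, for instance (a sketch):

    import numpy as np
    from scipy.linalg import null_space

    A = np.array([[1., 0., 2., 1.],
                  [0., 1., 1., 0.],
                  [1., 1., 3., 1.]])      # T : R^4 -> R^3, third row = row1 + row2

    rank = np.linalg.matrix_rank(A)       # dim R(T)
    nullity = null_space(A).shape[1]      # dim N(T)
    print(rank, nullity, rank + nullity)  # 2 2 4  (rank + nullity = n = dim U)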

2.4.3. Proof of Theorem 5. Note that (i) is immediately visible from (3), (4).
Property (ii) is a rewording of Theorem 4 (iv): suppose that ∑ cjT(uj) = 0 for some scalars cj. Since T is linear, this is equivalent to T(∑ cjuj) = 0, that is, to ∑ cjuj ∈ N(T). Since the uj form a basis, the T(uj) are linearly independent if and only if N(T) = {0}, which gives (ii).

Property (iii) is proved as follows.
Let v1, . . . , vr be a basis for R(T). Then vj = T(uj) for some uj ∈ U, for all j = 1, . . . , r.
Now let z1, . . . , zk be a basis for N(T). Property (iii) is proved if we show that B = {z1, . . . , zk, u1, . . . , ur} forms a basis for U.
To show linear independence, consider a linear combination

(8) c1z1 + . . .+ ckzk + d1u1 + . . .+ drur = 0

to which we apply T on both sides, and it gives

T (c1z1 + . . .+ ckzk + d1u1 + . . .+ drur) = T (0)

and by linearity

c1T (z1) + . . .+ ckT (zk) + d1T (u1) + . . .+ drT (ur) = 0

and since T (zj) = 0, and vj = T (uj)

d1v1 + . . .+ drvr = 0


which implies dj = 0 for all j, since v1, . . . , vr are linearly independent. Then (8) becomes

c1z1 + . . . + ckzk = 0

which implies cj = 0 for all j, since z1, . . . , zk are linearly independent. In conclusion, B is a linearly independent set.

To complete the argument we need to show that Sp(B) = U. Let u ∈ U. Then v = T(u) ∈ R(T), therefore

v = d1v1 + . . . + drvr

for some scalars d1, . . . , dr. Note that

T(u − (d1u1 + . . . + drur)) = T(u) − (d1v1 + . . . + drvr) = 0

therefore u − (d1u1 + . . . + drur) ∈ N(T), so it is a linear combination of the zj; hence u ∈ Sp(B). 2

2.4.4. More consequences of the Dimensionality Theorem 5 (iii). Let U, V be finite dimensional vector spaces, and T : U → V be a linear transformation.

1. Suppose dimU = dimV . Then T is one to one ⇐⇒ T is onto

For systems this means that for a given square matrix A we have: the homogeneous equation Ax = 0 has only the zero solution if and only if all systems Ax = b are solvable.
Reformulated as: either all Ax = b are solvable, or dimN(T) > 0; this is the celebrated Fredholm alternative.
To prove 1., note that: T is one to one ⇔ dimN(T) = 0 ⇔ (by Theorem 5 (iii)) dimR(T) = dimU ⇔ dimR(T) = dimV, which means R(T) = V, so T is onto.

2. Suppose dimU > dimV . Then T cannot be one to one.

Indeed, dimN (T ) = dimU − dimR(T ) ≥ dimU − dimV > 0.

3. Suppose dimU < dimV . Then T cannot be onto.

Indeed, dimR(T ) = dimU − dimN (T ) ≤ dimU < dimV .

2.5. Exercises. From Strang's book, 3rd. edition, Review Exercises Chapter 2, problems

2.6. Invertible transformations, isomorphisms.

2.6.1. The inverse function. Recall that if f : A → B is a function which is one to one and onto, then f has an inverse, denoted f−1, defined as f−1 : B → A by f(x) = y ⇔ x = f−1(y).
Recall that the compositions f ◦ f−1 and f−1 ◦ f equal the identity functions:

(f−1 ◦ f)(x) = f−1(f(x)) = x,   (f ◦ f−1)(y) = f(f−1(y)) = y


Examples:
1) Let f : R → R, f(x) = 5x. Then f−1 : R → R, f−1(x) = x/5.
2) Let g : R → R, g(x) = x + 3. Then g−1 : R → R, g−1(x) = x − 3.
3) Let h : (−∞, 0] → [0,∞), h(x) = x2. Then h−1 : [0,∞) → (−∞, 0], h−1(x) = −√x.

2.6.2. The inverse of a linear transformation. Let T : U → V be a linear transformation between two vector spaces U and V. If T is one to one and onto, then the function T has an inverse T−1 : V → U.

Exercise. Show that the inverse of a linear transformation is also a lineartransformation.

Definition 6. A linear transformation T : U → V which is one to one and onto is called an isomorphism of vector spaces, and U and V are called isomorphic vector spaces.

Whenever two vector spaces are isomorphic, any property of one space which can be written using the vector space operations can be translated, using the isomorphism T, into a similar property of the other space.

The following theorem shows that all finite dimensional vector spaces are essentially Rn or Cn:

Theorem 7. Any vector space over R of dimension n is isomorphic to Rn. Any vector space over C of dimension n is isomorphic to Cn.

Indeed, let U be a vector space of dimension n over F (where F is R or C). Let u1, . . . , un be a basis for U. Define T : U → Fn by T(uj) = ej for j = 1, . . . , n, extended by linearity to all the vectors in U. Clearly T is onto; since dimU = dim Fn, T is then also one to one (by consequence 1 of §2.4.4), hence it is an isomorphism. 2

Example. As a vector space, Pn, the space of polynomials of degree at most n with real coefficients, is essentially Rn+1.
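For instance, the coordinate map sending a polynomial to its vector of coefficients is such an isomorphism; here is a sketch for P2 → R3 (the helper T below is illustrative):

    import numpy as np
    from numpy.polynomial import Polynomial

    # T : P_2 -> R^3 sends p = a0 + a1 t + a2 t^2 to (a0, a1, a2)
    def T(p, n=2):
        return np.pad(p.coef, (0, n + 1 - len(p.coef)))

    p = Polynomial([1.0, 0.0, 4.0])       # 1 + 4t^2
    q = Polynomial([0.0, 2.0])            # 2t
    # T is linear: T(3p + q) = 3 T(p) + T(q)
    print(np.allclose(T(3*p + q), 3*T(p) + T(q)))   # True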

2.7. Change of basis, similar matrices. Let T : U → V be a linear transformation between two finite dimensional vector spaces U and V. Let BU = {u1, . . . , un} be a basis of U, and BV = {v1, . . . , vm} be a basis of V. Recall (see §2.2) that T is completely determined by the vectors T(u1), . . . , T(un), and these vectors, in turn, are completely determined by their expansion in the basis of V:

T(uj) = ∑_{i=1}^{m} Aij vi, for some Aij ∈ F, for all j = 1, . . . , n

and the matrix [Aij]_{i=1,...,m; j=1,...,n} is the matrix representation of the linear transformation T in the bases BU and BV.

Example. Consider the identity transformation I : U → U, I(x) = x. Then its matrix representation in the bases BU, BU is the identity matrix I


(the diagonal matrix with 1 on its diagonal). The identity transformation is often denoted simply I.

2.7.1. The matrix representation of a composition is the product of the matrix representations. We stated without proof:

Proposition 8. Let U, V, W be vector spaces over F (the same for all three spaces). Let BU = {u1, . . . , un} be a basis of U, BV = {v1, . . . , vm} be a basis of V, and BW = {w1, . . . , wp} be a basis of W.
Let T : U → V and S : V → W be linear transformations. Denote by AT the matrix representation of T in the bases BU, BV, and by AS the matrix representation of S in the bases BV, BW:

(U, BU) --T--> (V, BV) --S--> (W, BW)

and by composition

(U, BU) --S◦T--> (W, BW)

Then the matrix representation of S ◦ T : U → W in the bases BU, BW is the product of the two matrices: AS◦T = ASAT.

Proof. We have

(9) T(uj) = ∑_{i=1}^{m} AT,ij vi for all j = 1, . . . , n

and

S(vi) = ∑_{k=1}^{p} AS,ki wk, for all i = 1, . . . , m

Therefore

(S ◦ T)(uj) = S(T(uj)) = S( ∑_{i=1}^{m} AT,ij vi ) = ∑_{i=1}^{m} AT,ij S(vi)
= ∑_{i=1}^{m} AT,ij ( ∑_{k=1}^{p} AS,ki wk ) = ∑_{k=1}^{p} ( ∑_{i=1}^{m} AS,ki AT,ij ) wk

and noting that the coefficient of wk is the (k, j) entry of the matrix ASAT, the proof is complete. 2

Recall the formula for the product of two matrices A and B: the element of AB in position (k, j) is obtained by multiplying the elements on row k of A by the elements on column j of B, hence (AB)kj = ∑_i Aki Bij.

Exercise. Show that if T : U → V is invertible (note that necessarily U and V must have the same dimension), and if AT is its matrix representation in the bases BU, BV, then the matrix representation of T−1 in the bases BV, BU is the inverse of the matrix AT: A_{T−1} = (AT)−1.

2.7.2. Question: How does the matrix of T change when the basis BV is changed?
Suppose we want to change the basis BV = {v1, . . . , vm} to another basis B′V = {v′1, . . . , v′m}. Each vi can be expanded in the basis B′V:

(10) vi = ∑_{k=1}^{m} Ski v′k, i = 1, . . . , m, for some Ski ∈ F

therefore

T(uj) = ∑_{i=1}^{m} AT,ij vi = ∑_{k=1}^{m} ( ∑_{i=1}^{m} Ski AT,ij ) v′k = ∑_{k=1}^{m} (SAT)kj v′k

so the matrix representation of T in the bases BU, B′V is SAT.

Remark. The matrix S is called the change of basis matrix from BV to B′V. Relation (10) shows that S is the matrix representation of the identity transformation I : V → V in the bases BV, B′V:

(U, BU) --T--> (V, BV) --I--> (V, B′V)

and by composition

(U, BU) --I◦T--> (V, B′V)

Since I ◦ T = T, the associated matrix representations satisfy: SAT is the matrix associated to T in the bases BU, B′V.
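Concretely, if the vectors of BV and B′V are stored as the columns of matrices Q and P (their coordinates in some fixed basis), relation (10) says Q = PS, i.e. S = P−1Q. A numerical sketch (bases and AT chosen for illustration):

    import numpy as np

    Q = np.array([[1., 1.],
                  [0., 1.]])          # columns: basis B_V of R^2
    P = np.array([[2., 0.],
                  [1., 1.]])          # columns: new basis B'_V

    S = np.linalg.solve(P, Q)         # change of basis: v_i = sum_k S_ki v'_k

    AT = np.array([[1., 2., 0.],
                   [0., 1., 3.]])     # matrix of T in bases B_U, B_V
    AT_new = S @ AT                   # matrix of T in bases B_U, B'_V

    # check on the coordinate vector c of some x in basis B_U:
    # T(x), written in standard coordinates, must agree either way
    c = np.array([1., -1., 2.])
    print(np.allclose(Q @ (AT @ c), P @ (AT_new @ c)))   # True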

2.7.3. Question: How does the matrix of T change when the basis BU is changed?
Suppose we want to change the basis BU = {u1, . . . , un} to another basis B′U = {u′1, . . . , u′n}. Let R be the matrix of change of basis from BU to B′U.
We can either do a direct calculation with matrices, or use compositions; see the Remark below.
For a direct calculation, using the inverse of R to express B′U in terms of BU we have

(11) u′j = ∑_{k=1}^{n} (R−1)kj uk, j = 1, . . . , n

therefore

T(u′j) = T( ∑_{k=1}^{n} (R−1)kj uk ) = ∑_{k=1}^{n} (R−1)kj T(uk)


and by (9)

= ∑_{k=1}^{n} (R−1)kj ∑_{i=1}^{m} AT,ik vi = ∑_{i=1}^{m} ( ∑_{k=1}^{n} AT,ik (R−1)kj ) vi = ∑_{i=1}^{m} (ATR−1)ij vi

which means that the matrix representation of T in the bases B′U, BV is ATR−1.

Remark. A simpler derivation using compositions is as follows.
The matrix R is the change of basis matrix from BU to B′U, namely the matrix representation of the identity transformation I : U → U in the bases BU, B′U:

(U, BU) --I--> (U, B′U)

and relation (11) shows that R−1 is the matrix representation of its inverse (still the identity, but in the bases B′U, BU). Consider the composition

(U, B′U) --I--> (U, BU) --T--> (V, BV)

so

(U, B′U) --T◦I--> (V, BV)

Since T ◦ I = T, the associated matrix representations satisfy: ATR−1 is the matrix associated to T in the bases B′U, BV.

2.7.4. Similar matrices. In the particular case when T is a linear transformation from U to itself, T : U → U, T is called an endomorphism of U. Let AT be the matrix representation of T in the basis BU of U (the same basis for the domain and the codomain). Suppose we want to change the basis to B′U, and let S be the change of basis matrix. Combining §2.7.2 and §2.7.3, we see that the matrix representation of T in the basis B′U is SATS−1.

Definition 9. Two square matrices A,B are called similar if there existsan invertible matrix S so that B = SAS−1.

As seen above, similar matrices represent the same linear transformation (endomorphism), but in different bases.
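A numerical sketch: conjugation by an invertible S changes the matrix but not basis-independent quantities of the endomorphism, such as the trace and the determinant:

    import numpy as np

    rng = np.random.default_rng(3)
    A = rng.standard_normal((3, 3))        # matrix of T in basis B_U
    S = rng.standard_normal((3, 3))        # change of basis (invertible a.s.)

    B = S @ A @ np.linalg.inv(S)           # matrix of T in basis B'_U

    print(np.isclose(np.trace(A), np.trace(B)))              # True
    print(np.isclose(np.linalg.det(A), np.linalg.det(B)))    # True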

Notations. For linear transformations it is customary to simplify thenotations: instead of T (v) to write Tv, and instead of S ◦T to simply writeST .

We will next see different examples: projections, symmetries, rotations.

2.8. Examples.


2.8.1. Projections. An example of (orthogonal) projection in the x1x2-plane is the projection on the x1 line. It is defined as P0 : R2 → R2 by P0(x1, x2) = (x1, 0). This is a linear transformation (check!), and to find its matrix in the standard basis note that P0(e1) = e1 and P0(e2) = 0, therefore

AP0 = [ 1 0 ]
      [ 0 0 ]

and P0(x) = AP0 x (matrix AP0 times vector x).

As an example of (orthogonal) projection in space, let Q : R3 → R3 be the projection on the x1x2-plane: Q(x1, x2, x3) = (x1, x2, 0).

Exercise. Check that P0^2 = P0 and Q^2 = Q (recall that by P^2 we mean P ◦ P). Find the matrix of Q in the standard basis.

2.8.2. Rotations. Consider the transformation of the plane which rotates points by an angle θ counterclockwise around the origin. The simplest way to deduce the formula for this transformation is to consider the plane as the complex plane C, where rotation by the angle θ is achieved by multiplication with exp(iθ): let L : C → C be defined by L(z) = e^{iθ}z.
Now we only need to convert to real coordinates: if z = x1 + ix2 then

e^{iθ}z = (cos θ + i sin θ)(x1 + ix2) = (x1 cos θ − x2 sin θ) + i(x1 sin θ + x2 cos θ)

therefore define the rotation Rθ : R2 → R2 by Rθ(x1, x2) = (x1 cos θ − x2 sin θ, x1 sin θ + x2 cos θ), whose matrix in the standard basis is (check!)

ARθ = [ cos θ  −sin θ ]
      [ sin θ   cos θ ]

and the rotation is the map

Rθ([x1, x2]^T) = ARθ [x1, x2]^T

Remarks:
(i) Rθ1 Rθ2 = Rθ1+θ2 for any θ1, θ2.
(ii) Rθ is invertible, and (Rθ)−1 = R−θ.
(iii) det ARθ = 1.
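These remarks can be verified numerically (a sketch):

    import numpy as np

    def R(theta):
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s],
                         [s,  c]])

    t1, t2 = 0.7, 1.9
    print(np.allclose(R(t1) @ R(t2), R(t1 + t2)))          # (i)
    print(np.allclose(np.linalg.inv(R(t1)), R(-t1)))       # (ii)
    print(np.isclose(np.linalg.det(R(t1)), 1.0))           # (iii)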

2.8.3. Projections on other lines in R2. Consider the orthogonal projection Pθ : R2 → R2 on the line x2 = tan θ x1. We can obtain its formula using P0 and changes of coordinates.
We rotate the plane by an angle −θ, thus taking the projection line onto the x1-axis, then project using P0, and then rotate back. We obtain the projection Pθ as the composition Pθ = Rθ P0 R−θ.
Alternatively, we can think of this composition in terms of changes of coordinates. Let e′1 = Rθe1 and e′2 = Rθe2 (then e′1 is a unit vector on the line x2 = tan θ x1). In the coordinates e′1, e′2 the projection Pθ has the simple formula Pθ(c1e′1 + c2e′2) = c1e′1, having the matrix AP0 (as is clear from the geometry of the problem). We now need to write Pθ in the standard coordinates e1, e2:


(R2, e1,2) --I--> (R2, e′1,2) --Pθ--> (R2, e′1,2) --I--> (R2, e1,2)

hence the matrix of Pθ in the standard basis of R2 is APθ = ARθ AP0 AR−θ.
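A sketch of this composition, checking that Pθ is indeed a projection (Pθ² = Pθ) and that it fixes the line:

    import numpy as np

    def R(theta):
        c, s = np.cos(theta), np.sin(theta)
        return np.array([[c, -s], [s, c]])

    P0 = np.array([[1., 0.],
                   [0., 0.]])

    theta = 0.6
    P_theta = R(theta) @ P0 @ R(-theta)

    print(np.allclose(P_theta @ P_theta, P_theta))     # projection: P^2 = P
    v = np.array([np.cos(theta), np.sin(theta)])       # unit vector on the line
    print(np.allclose(P_theta @ v, v))                 # the line is fixed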

2.8.4. Reflexions. The reflexion of the plane about its x1-axis, T0 : R2 → R2, is obviously given by T0(x1, x2) = (x1, −x2), and its matrix representation in the standard basis is

AT0 = [ 1  0 ]
      [ 0 −1 ]

Now suppose we want to write the formula for the reflexion Tθ : R2 → R2 about the line making angle θ with the x1-axis (with equation x2 = tan θ x1). We can obtain this by composition of the simpler transformations we wrote before, as follows: rotate the plane by an angle −θ; this will take the line into the x1-axis; we then apply the reflexion T0; then we go back to the original coordinates by a rotation of angle θ. This gives Tθ = Rθ T0 R−θ = Rθ T0 Rθ−1.

Exercise. 1) Write the matrix associated to Tθ in the canonical basis. 2) Use the result above to write the matrix associated to the reflexion about the x2-axis in the canonical basis, and check the result using geometrical arguments.

Exercise. Use composition of transformations to find the formula for the orthogonal projection of the plane onto the line x1 = mx2, and check your result geometrically.

Also look over the problems in Strang's book, Sec. 2.6, Linear Transformations.

2.9. Summary of some properties of n dimensional vector spaces. The following hold for an n-dimensional vector space V. (We assume n > 0.)

Theorem. Any linearly independent set of vectors can be completed to a basis.

As a consequence of this theorem (proved earlier):
• Any set of n + 1 or more vectors in V is linearly dependent.
• Any set of n linearly independent vectors in V is a basis for V.

Theorem. Any spanning set of V contains a basis of V.
Proof. Consider a spanning set: Sp(v1, . . . , vp) = V.
Step 1. If 0 is among the vectors vj, then we remove it, and the remaining set is still a spanning set.
Step 2. If v1, . . . , vp ∈ V are linearly independent, then they form a basis of V, and the claim is proved.
Otherwise, there exists a linear relation c1v1 + . . . + cpvp = 0 with not all scalars cj equal to zero. Say cp ≠ 0. Then vp is a linear combination of v1, . . . , vp−1, and therefore Sp(v1, . . . , vp) = Sp(v1, . . . , vp−1) = V, and we now have a spanning set with p − 1 elements.
Continue with Step 1. The procedure ends with a basis of V. (It cannot end with the zero vector alone, since this would mean that dimV = 0.) 2


As a consequence of this theorem:
• Any set of n vectors in V which spans V is a basis for V.
• Any spanning set for V must contain at least n vectors.

2.10. Direct sum of subspaces. Let V be a vector space over the field F(which for us is either R or C).

Recall that if U,W are two subspaces of V then their sum is defined as

U +W = {u + w |u ∈ U, w ∈W}and that U +W is also a subspace of V .

2.10.1. Examples. Let V = R3.
1) If U and W are two distinct planes (through O) then U + W = R3.
2) If U and W are two distinct lines (through O) then U + W is their plane.
3) If U is a line and W is a plane (through O) then U + W is the whole space if U ⊄ W, and U + W = W if U ⊂ W.

Definition 10. If U ∩ W = {0} then their sum U + W is called the direct sum of the subspaces U and W, denoted by U ⊕ W.

In the examples of §2.10.1, the sum in 1) is not a direct sum (the intersection of two distinct planes is a line), but the sums in 2) and in 3) (when U ⊄ W) are direct sums.

Theorem 11. Let U, W be two subspaces of V with U ∩ W = {0}. If u1, . . . , un is a basis of U, and w1, . . . , wm is a basis of W, then u1, . . . , un, w1, . . . , wm is a basis of U ⊕ W.

In particular, dim(U ⊕ W) = dimU + dimW.

The proof is left as an exercise.
Remark: more is true: dim(U + W) = dimU + dimW − dim(U ∩ W).

Theorem 12. Existence of the complement space. If U is a subspace of V, then there exists a subspace W of V so that U ⊕ W = V.

The proof is left as an exercise.
Examples. Let U be the x1-axis in V = R2 (the x1x2-plane). Any other line W (through O) is a complement of U. (Why?)
Let U be the x1-axis in V = R3. Any plane (through O) not containing U is a complement of U. (Why?)

2.11. Left inverse, right inverse of functions and of matrices.

2.11.1. Inverses for functions. Let f : D → F be a function defined on a set D (the domain of f) with values in a set F (the codomain of f).
If f is one to one and onto, then there exists the inverse function f−1 : F → D, which satisfies:
(i) f−1(f(x)) = x for all x ∈ D, in other words f−1 ◦ f = Id where Id is the identity function of D (Id : D → D, Id(x) = x), and
(ii) f(f−1(x)) = x for all x ∈ F, in other words f ◦ f−1 = Id where Id is


the identity function of F (Id : F → F, Id(x) = x). (We should normally write IdD, IdF for the identity of D, respectively, F.)

In other words, f−1 is the inverse of f with respect to composition. Composition of functions is not commutative (f ◦ g ≠ g ◦ f in general), so in order to conclude that f−1 is the inverse of f, both relations f−1 ◦ f = Id and f ◦ f−1 = Id must be satisfied.

If f−1 ◦ f = Id then we say that f−1 is a left inverse of f.
If f ◦ f−1 = Id then we say that f−1 is a right inverse of f.
It is not hard to see that if f is one to one, then f has a left inverse, while if f is onto, then it has a right inverse. But we will restrict these considerations to linear transformations only.

2.11.2. Left inverse. Let T : U → V be a linear transformation.
Question: when can we find a left inverse for T, that is, a linear transformation L : V → U so that L ◦ T = I?
We must have L(T(u)) = u for all u ∈ U. This is clearly possible only if T is one to one: indeed, suppose that for some u we have T(u) = 0; then u = L(T(u)) = L(0) = 0, so u = 0, and T is one to one.

Let T be one to one. If it is also onto, then we can take L to be the inverse of T, and the problem is solved.
But we can construct a left inverse even if T is not onto. Indeed, let W be a complement of R(T): write V = R(T) ⊕ W. Let v1, . . . , vn be a basis of R(T) (where each vj has the form vj = T(uj) for a unique uj), and let w1, . . . , wp be a basis of W. Then v1, . . . , vn, w1, . . . , wp is a basis of V. To define L we must give its values on a basis of V: we must define L(vj) = uj for j = 1, . . . , n, while L(wj) can be chosen to be any vector in U. In conclusion:

Theorem 13. One to one transformations T do have left inverses L. Furthermore, decomposing V = R(T) ⊕ W, L restricted to R(T) is the inverse of T considered with the restricted codomain, T : U → R(T).

To formulate this result for matrices, let A be an m × n matrix, andconsider T : Rn → Rm defined by T (x) = Ax. The assumption that T isone to one means that n ≤ m and rankA = n (by Theorem 5 (iii)), therefore:

Theorem 14. An m × n matrix with n ≤ m and maximal rank has a leftinverse: an n×m matrix L so that LA = I.

If n = m then L = A−1, while if n < m then L is not unique.
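When A has full column rank, one concrete left inverse is L = (AᵀA)−1Aᵀ (this is the Moore–Penrose pseudoinverse in this case; other left inverses exist when n < m). A numerical sketch:

    import numpy as np

    rng = np.random.default_rng(1)
    A = rng.standard_normal((5, 3))            # m=5, n=3, full column rank (a.s.)

    L = np.linalg.inv(A.T @ A) @ A.T           # one left inverse: LA = I_n
    print(np.allclose(L @ A, np.eye(3)))       # True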

2.11.3. Right inverse. Let T : U → V be a linear transformation.
Question: when can we find a right inverse for T, that is, a linear transformation R : V → U so that T ◦ R = I?
We must have T(R(v)) = v for all v ∈ V. This implies that R(T) = V, hence T is onto.
Let T be onto. If T is also one to one, then we can take R to be the inverse of T. But we can construct R even if T is not one to one, as follows. Let v1, . . . , vm be a basis for V. Since T is onto, each vj has the form


vj = T(uj) for some uj ∈ U. Note that there are many possible choices for uj, since vj = T(x) for any x ∈ uj + N(T)!
For a choice of uj we can define R: we must set R(vj) = uj, as this satisfies T(R(vj)) = T(uj) = vj, which extended by linearity gives T(R(v)) = v for all v ∈ V. We proved:

Theorem 15. Let T : U → V be a linear transformation which is onto.Then T has a right inverse: there exists R : V → U so that T ◦R = I.

And in the language of matrices: let A be an m×n matrix, and considerT : Rn → Rm defined by T (x) = Ax. The assumption that T is ontomeans that n ≥ m and that Sp(Ae1, . . . , Aen) = Rm, which means thatrankA = m (this follows from your previous knowledge of linear algebra,or from this course later on, when we refer to the transpose/adjoint of amatrix).

Theorem 16. Let A be an m × n matrix with n ≥ m. If A has maximal rank, then A has a right inverse: an n × m matrix R so that AR = I.
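Symmetrically to the left inverse, when A has full row rank one concrete right inverse is R = Aᵀ(AAᵀ)−1. A numerical sketch:

    import numpy as np

    rng = np.random.default_rng(2)
    A = rng.standard_normal((3, 5))            # m=3, n=5, full row rank (a.s.)

    Rinv = A.T @ np.linalg.inv(A @ A.T)        # one right inverse: A R = I_m
    print(np.allclose(A @ Rinv, np.eye(3)))    # True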