
Mathematical Proceedings of the Cambridge Philosophical Society
http://journals.cambridge.org/PSP


    A generalized inverse for matrices

    R. Penrose

Mathematical Proceedings of the Cambridge Philosophical Society / Volume 51 / Issue 03 / July 1955, pp 406-413. DOI: 10.1017/S0305004100030401. Published online: 24 October 2008.

    Link to this article: http://journals.cambridge.org/abstract_S0305004100030401

How to cite this article: R. Penrose (1955). A generalized inverse for matrices. Mathematical Proceedings of the Cambridge Philosophical Society, 51, pp 406-413. doi:10.1017/S0305004100030401


    Downloaded from http://journals.cambridge.org/PSP, IP address: 148.228.124.31 on 08 Jun 2016


[ 406 ]

    A GENERALIZED INVERSE FOR MATRICES

BY R. PENROSE

Communicated by J. A. TODD

Received 26 July 1954

This paper describes a generalization of the inverse of a non-singular matrix, as the unique solution of a certain set of equations. This generalized inverse exists for any (possibly rectangular) matrix whatsoever with complex elements†. It is used here for solving linear matrix equations, and among other applications for finding an expression for the principal idempotent elements of a matrix. Also a new type of spectral decomposition is given.

In another paper its application to substitutional equations and the value of hermitian idempotents will be discussed.

Notation. Capital letters always denote matrices (not necessarily square) with complex elements. The conjugate transpose of the matrix A is written A*. Small letters are used for column vectors (with an asterisk for row vectors) and small Greek letters for complex numbers.

The following properties of the conjugate transpose will be used:

A** = A,
(A + B)* = A* + B*,
(BA)* = A*B*,
AA* = 0 implies A = 0.

The last of these follows from the fact that the trace of AA* is the sum of the squares of the moduli of the elements of A. From the last two we obtain the rule

BAA* = CAA* implies BA = CA, (1)

since (BAA* − CAA*)(B − C)* = (BA − CA)(BA − CA)*. Similarly

BA*A = CA*A implies BA* = CA*. (2)

THEOREM 1. The four equations

AXA = A, (3)
XAX = X, (4)
(AX)* = AX, (5)
(XA)* = XA, (6)

have a unique solution for any A.†

† Matrices over more general rings will be considered in a later paper.
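In modern terminology these four conditions define the Moore–Penrose inverse, which numpy computes as np.linalg.pinv. A minimal numerical sketch (not part of the paper) checking all four for a random rectangular complex matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
X = np.linalg.pinv(A)  # the unique solution of (3)-(6)

assert np.allclose(A @ X @ A, A)                # (3) AXA = A
assert np.allclose(X @ A @ X, X)                # (4) XAX = X
assert np.allclose((A @ X).conj().T, A @ X)     # (5) AX hermitian
assert np.allclose((X @ A).conj().T, X @ A)     # (6) XA hermitian
```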


A generalized inverse for matrices 407

Proof. I first show that equations (4) and (5) are equivalent to the single equation

XX*A* = X. (7)

Equation (7) follows from (4) and (5), since it is merely (5) substituted in (4). Conversely, (7) implies AXX*A* = AX, the left-hand side of which is hermitian. Thus (5) follows, and substituting (5) in (7) we get (4). Similarly, (3) and (6) can be replaced by the equation

XAA* = A*. (8)

Thus it is sufficient to find an X satisfying (7) and (8). Such an X will exist if a B can be found satisfying

BA*AA* = A*.

For then X = BA* satisfies (8). Also, we have seen that (8) implies A*X*A* = A* and therefore BA*X*A* = BA*. Thus X also satisfies (7).

Now the expressions A*A, (A*A)², (A*A)³, ... cannot all be linearly independent, i.e. there exists a relation

λ_1 A*A + λ_2 (A*A)² + ... + λ_k (A*A)^k = 0, (9)

where λ_1, ..., λ_k are not all zero. Let λ_r be the first non-zero λ and put

B = −λ_r^{−1} {λ_{r+1} I + λ_{r+2} A*A + ... + λ_k (A*A)^{k−r−1}}.

Thus (9) gives B(A*A)^{r+1} = (A*A)^r, and applying (1) and (2) repeatedly we obtain BA*AA* = A*, as required.
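This construction can be carried out numerically: the relation (9) may be taken from λ times the characteristic polynomial of A*A. A minimal sketch (not part of the paper; numerically fragile for ill-conditioned matrices, but fine for illustration):

```python
import numpy as np

def gi(A, tol=1e-9):
    """Generalized inverse via the relation (9): find B with
    B (A*A)^(r+1) = (A*A)^r, then the g.i. is B A*."""
    M = A.conj().T @ A
    n = M.shape[0]
    c = np.poly(np.linalg.eigvals(M)).real   # charpoly coeffs, highest power first
    lam = c[::-1]                            # lam[i] multiplies M**(i+1) in lam*charpoly
    r = next(i for i, v in enumerate(lam) if abs(v) > tol)
    B = -sum(lam[j] * np.linalg.matrix_power(M, j - r - 1)
             for j in range(r + 1, n + 1)) / lam[r]
    return B @ A.conj().T

A = np.array([[1., 0.], [1., 0.], [0., 1.]])
assert np.allclose(gi(A), np.linalg.pinv(A))
A_singular = np.array([[1., 1.], [1., 1.]])              # rank-deficient case
assert np.allclose(gi(A_singular), np.linalg.pinv(A_singular))
```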

To show that X is unique, we suppose that X satisfies (7) and (8) and that Y satisfies Y = A*Y*Y and A* = A*AY. These last relations are obtained by respectively substituting (6) in (4) and (5) in (3). (They are (7) and (8) with Y in place of X and the reverse order of multiplication and must, by symmetry, also be equivalent to (3), (4), (5) and (6).) Now

X = XX*A* = XX*A*AY = XAY = XAA*Y*Y = A*Y*Y = Y.

The unique solution of (3), (4), (5) and (6) will be called the generalized inverse of A (abbreviated g.i.) and written X = A†. (Note that A need not be a square matrix and may even be zero.) I shall also use the notation λ† for scalars, where λ† means λ^{−1} if λ ≠ 0 and 0 if λ = 0.

In the calculation of A† it is only necessary to solve the two unilateral linear equations

XAA* = A* and A*AY = A*.

By putting A† = XAY and using the fact that XA and AY are hermitian and satisfy AXA = A = AYA, we observe that the four relations AA†A = A, A†AA† = A†, (AA†)* = AA† and (A†A)* = A†A are satisfied. Since A† satisfies (7), (8) and their duals, we have also

A† = A†A†*A* = A*A†*A†. (10)
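A minimal numerical sketch of this recipe (not part of the paper): both unilateral systems are consistent, so numpy's lstsq yields exact solutions up to rounding, and XAY reproduces the g.i.:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3, 5x4

Ah = A.conj().T
# X AA* = A*  is equivalent to  (AA*) X* = A, since AA* is hermitian
X = np.linalg.lstsq(A @ Ah, A, rcond=None)[0].conj().T
# A*A Y = A*
Y = np.linalg.lstsq(Ah @ A, Ah, rcond=None)[0]

Agi = X @ A @ Y                       # the generalized inverse
assert np.allclose(Agi, np.linalg.pinv(A))
```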


408 R. PENROSE

Note that the properties (A + B)* = A* + B* and (λA)* = λ̄A* were not used in the proof of Theorem 1.

LEMMA.
1·1 A†† = A;
1·2 A*† = A†*;
1·3 if A is non-singular, A† = A^{−1};
1·4 (λA)† = λ†A†;
1·5 (A*A)† = A†A†*;
1·6 if U and V are unitary, (UAV)† = V*A†U*;
1·7 if A = ΣA_i, where A_iA_j* = 0 and A_i*A_j = 0 whenever i ≠ j, then A† = ΣA_i†;
1·8 if A is normal, A†A = AA† and (A^n)† = (A†)^n;
1·9 A, A*A, A† and A†A all have rank equal to trace A†A.
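A few of these relations checked numerically (a sketch, not part of the paper; np.linalg.pinv plays the role of the g.i.):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
pinv = np.linalg.pinv

assert np.allclose(pinv(pinv(A)), A)                                  # 1.1
assert np.allclose(pinv(A.conj().T), pinv(A).conj().T)                # 1.2
assert np.allclose(pinv(A.conj().T @ A), pinv(A) @ pinv(A).conj().T)  # 1.5
rank = np.trace(pinv(A) @ A).real                                     # 1.9
assert np.isclose(rank, np.linalg.matrix_rank(A))
```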

Proof. Since the g.i. of a matrix is always unique, relations 1·1, ..., 1·7 can be verified by substituting the right-hand side of each in the defining relations for the required g.i. in each case. For 1·5 we require (10), and for 1·7 we require the fact that A_iA_j† = 0 and A_i†A_j = 0 if i ≠ j. This follows from A_j† = A_j†A_j†*A_j* and A_i† = A_i*A_i†*A_i†. The first part of 1·8, of which the second part is a consequence, follows from 1·5 and (10), since A†A = (A*A)†A*A and AA† = (AA*)†AA*. (The condition for A to be normal is A*A = AA*.)‡ To prove 1·9 we observe that each of A, A*A, A† and A†A can be obtained from the others by pre- and post-multiplication by suitable matrices. Thus their ranks must all be equal, and A†A, being idempotent, has rank equal to its trace.

Relation 1·5 provides a method of calculating the g.i. of a matrix A from the g.i. of A*A, since A† = (A*A)†A*. Now A*A, being hermitian, can be reduced to diagonal form by a unitary transformation. Hence A*A = UDU*, where U is unitary and D = diag(α_1, ..., α_n). Thus D† = diag(α_1†, ..., α_n†) and (A*A)† = UD†U* by 1·6.
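A minimal numerical sketch of this route (not part of the paper), using numpy's eigh for the unitary diagonalization of A*A:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

w, U = np.linalg.eigh(A.T @ A)            # A*A = U D U*, D = diag(w)
w_dag = np.array([1.0 / x if abs(x) > 1e-12 else 0.0 for x in w])
AstarA_gi = U @ np.diag(w_dag) @ U.T      # (A*A)† = U D† U*

assert np.allclose(AstarA_gi @ A.T, np.linalg.pinv(A))   # A† = (A*A)† A*
```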

An alternative proof of the existence of the g.i. of a square matrix can be obtained from Autonne's (1) result that any square matrix A can be written in the form VBW, where V and W are unitary and B is diagonal.† Now B† exists, since it can be defined similarly to D† above. It is thus sufficient to verify that A† = W*B†V* as in 1·6. A rectangular matrix can be treated by bordering it with zeros to make it square.

Notice that A† is a continuous function of A if the rank of A is kept fixed, since in the singular case the polynomial in (9) could be taken to be the characteristic function of A*A. The value of r, being the nullity of A*A (as A*A is hermitian), is thus determined by the rank of A (by 1·9). Also A† varies discontinuously when the rank of A changes, since rank A = trace A†A by 1·9.
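The discontinuity at a rank change is easy to see numerically (a sketch, not part of the paper):

```python
import numpy as np

# Fixed rank: the g.i. varies continuously.  When the rank drops, it jumps.
for eps in (1e-2, 1e-4, 1e-6):
    A = np.array([[1.0, 0.0], [0.0, eps]])                  # rank 2
    assert np.isclose(np.linalg.pinv(A)[1, 1], 1.0 / eps)   # blows up as eps -> 0
A0 = np.array([[1.0, 0.0], [0.0, 0.0]])                     # rank 1
assert np.isclose(np.linalg.pinv(A0)[1, 1], 0.0)            # not the limit of the above
```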

‡ This result, like several of the earlier ones, is a simple consequence of the construction of A† given in Theorem 1. It seems desirable, however, to show that it is also a direct consequence of (3), (4), (5) and (6).

† A simple proof of this result is as follows: since AA* and A*A are both hermitian and have the same eigenvalues, there exists a unitary matrix T such that TAA*T* = A*A. Hence TA is normal and therefore diagonable by unitary transformation.


THEOREM 2. A necessary and sufficient condition for the equation AXB = C to have a solution is

AA†CB†B = C,

in which case the general solution is

X = A†CB† + Y − A†AYBB†,

where Y is arbitrary.
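A numerical illustration (a sketch, not part of the paper) of the criterion and of the particular solution X = A†CB†:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((3, 5))
C = A @ rng.standard_normal((2, 3)) @ B          # solvable by construction

Agi, Bgi = np.linalg.pinv(A), np.linalg.pinv(B)
assert np.allclose(A @ Agi @ C @ Bgi @ B, C)     # the criterion holds
X = Agi @ C @ Bgi                                # a particular solution
assert np.allclose(A @ X @ B, C)

C_bad = np.ones((4, 5))                          # generically unsolvable
assert not np.allclose(A @ Agi @ C_bad @ Bgi @ B, C_bad)
```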


Proof. The proofs of 2·1 and 2·2 are evident. To prove 2·3 suppose K² = K; then, by 1·1, K = {(K†K)(KK†)}†. Thus E = KK† and F = K†K will do. Conversely, if K = (FE)†, then K is of the form EFPEF (since Q† = Q*(Q†*Q†Q†*)Q* by (10)). Hence K = EKF, so that

K² = E(FE)†FE(FE)†F = E(FE)†F = K.

Principal idempotent elements of a matrix. For any square matrix A there exists a unique set of matrices K_λ, defined for each complex number λ, such that

K_λK_μ = δ_λμ K_λ, (11)
ΣK_λ = I, (12)
AK_λ = K_λA, (13)
(A − λI)K_λ is nilpotent, (14)

the non-zero K_λ's being the principal idempotent elements of A (see Wedderburn (7)).† Unless λ is an eigenvalue of A, K_λ is zero, so that the sum in (12) is finite, though summed over all complex λ. The following theorem gives an explicit formula for K_λ in terms of g.i.'s.

THEOREM 3. If E_λ = I − {(A − λI)^n}†(A − λI)^n and F_λ = I − (A − λI)^n{(A − λI)^n}†, where n is sufficiently large (e.g. the order of A), then the principal idempotent elements of A are given by K_λ = (F_λE_λ)†, and n can be taken as unity if and only if A is diagonable.

Proof. First, suppose A is diagonable and put E_λ = I − (A − λI)†(A − λI) and F_λ = I − (A − λI)(A − λI)†. Then E_λ and F_λ are h.i. by 2·1 and are zero unless λ is an eigenvalue of A, by 1·3. Now

(A − μI)E_μ = 0 and F_λ(A − λI) = 0, (15)

so that μF_λE_μ = F_λAE_μ = λF_λE_μ. Thus

F_λE_μ = 0 if λ ≠ μ. (16)

Putting K_λ = (F_λE_λ)† we have, by 2·3,

K_λ = E_λ(F_λE_λ)†F_λ. (17)

Hence K_λK_μ = δ_λμK_λ by (16). Also (17) gives

F_μK_λE_ν = δ_μλδ_λν F_λE_λ. (18)

Any eigenvector z_α of A corresponding to the eigenvalue α satisfies E_αz_α = z_α. Since A is diagonable, any column vector x conformable with A is expressible as a sum of eigenvectors; i.e. it is expressible in the form (a finite sum over all complex λ)

x = ΣE_λx_λ.

† The existence of such K_λ's can be established as follows: let φ(λ) = (λ − λ_1)^{n_1} ... (λ − λ_k)^{n_k} be, say, the minimum function of A, where the factors (λ − λ_i)^{n_i} are mutually coprime. Then if φ_i(λ) = (λ − λ_i)^{−n_i}φ(λ)


Similarly, if y* is conformable with A, it is expressible as

y* = Σy_λ*F_λ.

Now y*(ΣK_λ)x = (Σ_μ y_μ*F_μ)(Σ_λ K_λ)(Σ_ν E_νx_ν) = Σ_λ y_λ*F_λE_λx_λ = y*x, by (18) and (16). Hence ΣK_λ = I.

Also, from (15) and (17) we obtain

AK_λ = λK_λ = K_λA. (19)

Thus conditions (13) and (14) are satisfied. Also

A = ΣλK_λ. (20)

Conversely, if it is not assumed that A is diagonable but that ΣK_λ = I, K_λ being defined as above (i.e. with n = 1), then x = ΣK_λx gives x as a sum of eigenvectors of A, since (19) was deduced without assuming the diagonability of A.

If A is not diagonable it seems more convenient simply to prove that for any set of K_λ's satisfying (11), (12), (13) and (14) each K_λ = (F_λE_λ)†, where F_λ and E_λ are defined as in the theorem.

Now, if the K_λ's satisfy (11), (12), (13) and (14), they must certainly satisfy ΣK_λ = I and

(A − λI)^nK_λ = 0 = K_λ(A − λI)^n,

where n is sufficiently large (if n is the order of A, this will do). From this and the definitions of E_λ and F_λ given in the theorem we obtain

E_λK_λ = K_λ = K_λF_λ. (21)

Using Euclid's algorithm we can find P and Q (polynomials in A) such that

I = (A − λI)^nP + Q(A − μI)^n,

provided that λ ≠ μ. Now F_λ(A − λI)^n = 0 = (A − μI)^nE_μ. Hence F_λE_μ = 0 if λ ≠ μ, which with (21) gives F_λK_μ = 0 = K_μE_λ whenever λ ≠ μ. But ΣK_λ = I. Thus

F_λK_λ = F_λ and K_λE_λ = E_λ. (22)

Using (21) and (22) it is a simple matter to verify that (F_λE_λ)† = K_λ. (Note that K_λ = I − {(I − E_λ)(I − F_λ)}† also.) Incidentally, this provides a proof of the uniqueness of K_λ.†

COROLLARY. If A is normal, it is diagonable and its principal idempotent elements are hermitian.

Proof. Since A − λI is normal, we can apply 1·8 to the definitions of E_λ and F_λ in Theorem 3. This gives E_λ = I − (A − λI)†(A − λI) and F_λ = I − (A − λI)(A − λI)†, whence A is diagonable. Also E_λ = F_λ, so that K_λ = E_λ is hermitian.

† In fact, ΣK_λ = I and (A − λI)^nK_λ = 0 are sufficient for uniqueness.


‡ For this result see, for example, Drazin (4).

If A is normal, (20) gives A = ΣλK_λ, and thus using 1·7, 1·4 and 2·2 we obtain A† = Σλ†K_λ.

A new type of spectral decomposition. Unless A is normal, there does not appear to be any simple expression for A† in terms of its principal idempotent elements. However, this difficulty is overcome with the following decomposition, which applies equally to rectangular matrices.

THEOREM 4. Any matrix A is uniquely expressible in the form

A = ΣαU_α,

this being a finite sum over real values of α, where U_α† = U_α*, and U_α*U_β = 0 and U_αU_β* = 0 whenever α ≠ β.

Proof. The condition on the U's can be written comprehensively as

U_αU_β*U_γ = δ_αβδ_βγ U_α.

For U_αU_α*U_α = U_α implies U_α*U_αU_α* = U_α*, whence, by uniqueness, U_α† = U_α* (as U_αU_α* and U_α*U_α are both hermitian). Also U_αU_β*U_β = 0 and U_αU_α*U_β = 0 respectively imply U_αU_β* = 0 and U_α*U_β = 0, by (2).

Define E_λ = I − (A*A − λI)†(A*A − λI). The matrix A*A is normal, being hermitian, and is non-negative definite. Hence the non-zero E_λ's are its principal idempotent elements and E_λ = 0 unless λ ≥ 0. Thus

A*A = ΣλE_λ and (A*A)† = Σλ†E_λ.

Hence A†A = (A*A)†A*A = Σλ†λE_λ = Σ_{λ>0} E_λ.

Put U_α = α^{−1}AE_{α²} if α > 0 and U_α = 0 otherwise. Then

Σ_{α>0} αU_α = A Σ_{λ>0} E_λ = AA†A = A.

Also, if α, γ > 0,

U_αU_β*U_γ = α^{−1}β^{−1}γ^{−1} AE_{α²}E_{β²}A*AE_{γ²} = δ_αβδ_βγ U_α.

To prove the uniqueness suppose A = Σ_{α>0} αV_α. Then it is easily verified that the non-zero expressions V_α*V_α, for α > 0, and I − ΣV_β*V_β are the principal idempotent elements of A*A corresponding respectively to the eigenvalues α² and 0. Hence V_α*V_α = E_{α²}, where α > 0, giving

U_α = α^{−1}AE_{α²} = α^{−1}(Σ_β βV_β)V_α*V_α = V_α.

Polar representation of a matrix. It is a well-known result (closely related to Autonne's) that any square matrix is the product of a hermitian with a unitary matrix (see, for example, Halmos (5)).

† Thus A† = Σα†U_α*.


Put H = ΣαU_αU_α*. Then H is non-negative definite hermitian, since U_α = 0 unless α > 0, and H² = AA*. (Such an H must be unique.) We have H†H = HH† = AA†.

Now AA† and A†A are equivalent under a unitary transformation, both being hermitian and having the same eigenvalues; i.e. there is a unitary matrix W satisfying WA†A = AA†W. Putting V = H†A + W − WA†A, we see that VV* = I and A = HV. This is the polar representation of A.

The polar representation is unique if A is non-singular, but not if A is singular. However, if we require A = HU, where H is (as before) non-negative definite hermitian, U† = U* and UU* = H†H, the representation is always unique and also exists for rectangular matrices. The uniqueness of H follows from

AA* = HUU*H = H(H†H)H = H²,

and that of U from

H†A = H†HU = UU*U = U.

The existence of H and U is established by H = ΣαU_αU_α* and U = ΣU_α. If we put G = ΣαU_α*U_α we obtain the alternative representation A = UG.

I wish to thank Dr M. P. Drazin for useful suggestions and for advice in the preparation of this paper.

REFERENCES

(1) AUTONNE, L. Sur les matrices hypohermitiennes et sur les matrices unitaires. Ann. Univ. Lyon (2), 38 (1915), 1-77.
(2) BJERHAMMAR, A. Rectangular reciprocal matrices, with special reference to geodetic calculations. Bull. géod. int. (1951), pp. 188-220.
(3) CECIONI, F. Sopra operazioni algebriche. Ann. Scu. norm. sup. Pisa, 11 (1910), 17-20.
(4) DRAZIN, M. P. On diagonable and normal matrices. Quart. J. Math. (2), 2 (1951), 189-98.
(5) HALMOS, P. R. Finite dimensional vector spaces (Princeton, 1942).
(6) MURRAY, F. J. and VON NEUMANN, J. On rings of operators. Ann. Math., Princeton, (2), 37 (1936), 141-3.
(7) WEDDERBURN, J. H. M. Lectures on matrices (Colloq. Publ. Amer. math. Soc. no. 17, 1934).

    S T JOHN'S COLLEGE

    CAMBRIDGE