
Mathematical Proceedings of the Cambridge Philosophical Society
http://journals.cambridge.org/PSP


    A generalized inverse for matrices

    R. Penrose

Mathematical Proceedings of the Cambridge Philosophical Society / Volume 51 / Issue 03 / July 1955, pp 406-413. DOI: 10.1017/S0305004100030401. Published online: 24 October 2008.

    Link to this article: http://journals.cambridge.org/abstract_S0305004100030401

How to cite this article: R. Penrose (1955). A generalized inverse for matrices. Mathematical Proceedings of the Cambridge Philosophical Society, 51, pp 406-413. doi:10.1017/S0305004100030401


    Downloaded from http://journals.cambridge.org/PSP, IP address: 148.228.124.31 on 08 Jun 2016


[ 406 ]

    A GENERALIZED INVERSE FOR MATRICES

BY R. PENROSE

Communicated by J. A. TODD

Received 26 July 1954

This paper describes a generalization of the inverse of a non-singular matrix, as the unique solution of a certain set of equations. This generalized inverse exists for any (possibly rectangular) matrix whatsoever with complex elements†. It is used here for solving linear matrix equations, and among other applications for finding an expression for the principal idempotent elements of a matrix. Also a new type of spectral decomposition is given.

In another paper its application to substitutional equations and the value of hermitian idempotents will be discussed.

Notation. Capital letters always denote matrices (not necessarily square) with complex elements. The conjugate transpose of the matrix A is written A*. Small letters are used for column vectors (with an asterisk for row vectors) and small Greek letters for complex numbers.

The following properties of the conjugate transpose will be used:

A** = A,
(A + B)* = A* + B*,
(BA)* = A*B*,
AA* = 0 implies A = 0.

The last of these follows from the fact that the trace of AA* is the sum of the squares of the moduli of the elements of A. From the last two we obtain the rule

BAA* = CAA* implies BA = CA, (1)

since (BAA* − CAA*)(B − C)* = (BA − CA)(BA − CA)*. Similarly

BA*A = CA*A implies BA* = CA*. (2)

THEOREM 1. The four equations

AXA = A, (3)
XAX = X, (4)
(AX)* = AX, (5)
(XA)* = XA, (6)

have a unique solution for any A.†

† Matrices over more general rings will be considered in a later paper.
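In modern terminology these four conditions define the Moore–Penrose inverse, which numpy computes as np.linalg.pinv. A minimal numerical sketch (not part of the paper) checking all four for a random rectangular complex matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
X = np.linalg.pinv(A)  # the unique solution of (3)-(6)

assert np.allclose(A @ X @ A, A)                # (3) AXA = A
assert np.allclose(X @ A @ X, X)                # (4) XAX = X
assert np.allclose((A @ X).conj().T, A @ X)     # (5) AX hermitian
assert np.allclose((X @ A).conj().T, X @ A)     # (6) XA hermitian
```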


A generalized inverse for matrices 407

Proof. I first show that equations (4) and (5) are equivalent to the single equation

XX*A* = X. (7)

Equation (7) follows from (4) and (5), since it is merely (5) substituted in (4). Conversely, (7) implies AXX*A* = AX, the left-hand side of which is hermitian. Thus (5) follows, and substituting (5) in (7) we get (4). Similarly, (3) and (6) can be replaced by the equation

XAA* = A*. (8)

Thus it is sufficient to find an X satisfying (7) and (8). Such an X will exist if a B can be found satisfying

BA*AA* = A*.

For then X = BA* satisfies (8). Also, we have seen that (8) implies A*X*A* = A* and therefore BA*X*A* = BA*. Thus X also satisfies (7).

Now the expressions A*A, (A*A)², (A*A)³, ... cannot all be linearly independent, i.e. there exists a relation

λ_1 A*A + λ_2 (A*A)² + ... + λ_k (A*A)^k = 0, (9)

where λ_1, ..., λ_k are not all zero. Let λ_r be the first non-zero λ and put

B = −λ_r^{−1} {λ_{r+1} I + λ_{r+2} A*A + ... + λ_k (A*A)^{k−r−1}}.

Thus (9) gives B(A*A)^{r+1} = (A*A)^r, and applying (1) and (2) repeatedly we obtain BA*AA* = A*, as required.
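This construction can be carried out numerically: the relation (9) may be taken from λ times the characteristic polynomial of A*A. A minimal sketch (not part of the paper; numerically fragile for ill-conditioned matrices, but fine for illustration):

```python
import numpy as np

def gi(A, tol=1e-9):
    """Generalized inverse via the relation (9): find B with
    B (A*A)^(r+1) = (A*A)^r, then the g.i. is B A*."""
    M = A.conj().T @ A
    n = M.shape[0]
    c = np.poly(np.linalg.eigvals(M)).real   # charpoly coeffs, highest power first
    lam = c[::-1]                            # lam[i] multiplies M**(i+1) in lam*charpoly
    r = next(i for i, v in enumerate(lam) if abs(v) > tol)
    B = -sum(lam[j] * np.linalg.matrix_power(M, j - r - 1)
             for j in range(r + 1, n + 1)) / lam[r]
    return B @ A.conj().T

A = np.array([[1., 0.], [1., 0.], [0., 1.]])
assert np.allclose(gi(A), np.linalg.pinv(A))
A_singular = np.array([[1., 1.], [1., 1.]])              # rank-deficient case
assert np.allclose(gi(A_singular), np.linalg.pinv(A_singular))
```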

To show that X is unique, we suppose that X satisfies (7) and (8) and that Y satisfies Y = A*Y*Y and A* = A*AY. These last relations are obtained by respectively substituting (6) in (4) and (5) in (3). (They are (7) and (8) with Y in place of X and the reverse order of multiplication and must, by symmetry, also be equivalent to (3), (4), (5) and (6).) Now

X = XX*A* = XX*A*AY = XAY = XAA*Y*Y = A*Y*Y = Y.

The unique solution of (3), (4), (5) and (6) will be called the generalized inverse of A (abbreviated g.i.) and written X = A†. (Note that A need not be a square matrix and may even be zero.) I shall also use the notation λ† for scalars, where λ† means λ^{−1} if λ ≠ 0 and 0 if λ = 0.

In the calculation of A† it is only necessary to solve the two unilateral linear equations

XAA* = A* and A*AY = A*.

By putting A† = XAY and using the fact that XA and AY are hermitian and satisfy AXA = A = AYA, we observe that the four relations AA†A = A, A†AA† = A†, (AA†)* = AA† and (A†A)* = A†A are satisfied. Since A† satisfies (7), (8) and their duals, we have also

A† = A†A†*A* = A*A†*A†. (10)
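A minimal numerical sketch of this recipe (not part of the paper): both unilateral systems are consistent, so numpy's lstsq yields exact solutions up to rounding, and XAY reproduces the g.i.:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3)) @ rng.standard_normal((3, 4))  # rank 3, 5x4

Ah = A.conj().T
# X AA* = A*  is equivalent to  (AA*) X* = A, since AA* is hermitian
X = np.linalg.lstsq(A @ Ah, A, rcond=None)[0].conj().T
# A*A Y = A*
Y = np.linalg.lstsq(Ah @ A, Ah, rcond=None)[0]

Agi = X @ A @ Y                       # the generalized inverse
assert np.allclose(Agi, np.linalg.pinv(A))
```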


408 R. PENROSE

Note that the properties (A + B)* = A* + B* and (λA)* = λ̄A* were not used in the proof of Theorem 1.

LEMMA.
1·1 A†† = A;
1·2 A*† = A†*;
1·3 if A is non-singular, A† = A^{−1};
1·4 (λA)† = λ†A†;
1·5 (A*A)† = A†A†*;
1·6 if U and V are unitary, (UAV)† = V*A†U*;
1·7 if A = ΣA_i, where A_iA_j* = 0 and A_i*A_j = 0 whenever i ≠ j, then A† = ΣA_i†;
1·8 if A is normal, A†A = AA† and (A^n)† = (A†)^n;
1·9 A, A*A, A† and A†A all have rank equal to trace A†A.
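A few of these relations checked numerically (a sketch, not part of the paper; np.linalg.pinv plays the role of the g.i.):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 3)) + 1j * rng.standard_normal((4, 3))
pinv = np.linalg.pinv

assert np.allclose(pinv(pinv(A)), A)                                  # 1.1
assert np.allclose(pinv(A.conj().T), pinv(A).conj().T)                # 1.2
assert np.allclose(pinv(A.conj().T @ A), pinv(A) @ pinv(A).conj().T)  # 1.5
rank = np.trace(pinv(A) @ A).real                                     # 1.9
assert np.isclose(rank, np.linalg.matrix_rank(A))
```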

Proof. Since the g.i. of a matrix is always unique, relations 1·1, ..., 1·7 can be verified by substituting the right-hand side of each in the defining relations for the required g.i. in each case. For 1·5 we require (10), and for 1·7 we require the fact that A_iA_j† = 0 and A_i†A_j = 0 if i ≠ j. This follows from A_j† = A_j†A_j†*A_j* and A_i† = A_i*A_i†*A_i†. The first part of 1·8, of which the second part is a consequence, follows from 1·5 and (10), since A†A = (A*A)†A*A and AA† = (AA*)†AA*. (The condition for A to be normal is A*A = AA*.)‡ To prove 1·9 we observe that each of A, A*A, A† and A†A can be obtained from the others by pre- and post-multiplication by suitable matrices. Thus their ranks must all be equal, and A†A, being idempotent, has rank equal to its trace.

Relation 1·5 provides a method of calculating the g.i. of a matrix A from the g.i. of A*A, since A† = (A*A)†A*. Now A*A, being hermitian, can be reduced to diagonal form by a unitary transformation. Hence A*A = UDU*, where U is unitary and D = diag(α_1, ..., α_n). Thus D† = diag(α_1†, ..., α_n†) and (A*A)† = UD†U* by 1·6.
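A minimal numerical sketch of this route (not part of the paper), using numpy's eigh for the unitary diagonalization of A*A:

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))

w, U = np.linalg.eigh(A.T @ A)            # A*A = U D U*, D = diag(w)
w_dag = np.array([1.0 / x if abs(x) > 1e-12 else 0.0 for x in w])
AstarA_gi = U @ np.diag(w_dag) @ U.T      # (A*A)† = U D† U*

assert np.allclose(AstarA_gi @ A.T, np.linalg.pinv(A))   # A† = (A*A)† A*
```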

An alternative proof of the existence of the g.i. of a square matrix can be obtained from Autonne's (1) result that any square matrix A can be written in the form VBW, where V and W are unitary and B is diagonal.† Now B† exists, since it can be defined similarly to D† above. It is thus sufficient to verify that A† = W*B†V* as in 1·6. A rectangular matrix can be treated by bordering it with zeros to make it square.

Notice that A† is a continuous function of A if the rank of A is kept fixed, since in the singular case the polynomial in (9) could be taken to be the characteristic function of A*A. The value of r, being the nullity of A*A (as A*A is hermitian), is thus determined by the rank of A (by 1·9). Also A† varies discontinuously when the rank of A changes, since rank A = trace A†A by 1·9.
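The discontinuity at a rank change is easy to see numerically (a sketch, not part of the paper):

```python
import numpy as np

# Fixed rank: the g.i. varies continuously.  When the rank drops, it jumps.
for eps in (1e-2, 1e-4, 1e-6):
    A = np.array([[1.0, 0.0], [0.0, eps]])                  # rank 2
    assert np.isclose(np.linalg.pinv(A)[1, 1], 1.0 / eps)   # blows up as eps -> 0
A0 = np.array([[1.0, 0.0], [0.0, 0.0]])                     # rank 1
assert np.isclose(np.linalg.pinv(A0)[1, 1], 0.0)            # not the limit of the above
```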

‡ This result, like several of the earlier ones, is a simple consequence of the construction of A† given in Theorem 1. It seems desirable, however, to show that it is also a direct consequence of (3), (4), (5) and (6).

† A simple proof of this result is as follows: since AA* and A*A are both hermitian and have the same eigenvalues, there exists a unitary matrix T such that TAA*T* = A*A. Hence TA is normal and therefore diagonable by unitary transformation.


THEOREM 2. A necessary and sufficient condition for the equation AXB = C to have a solution is

AA†CB†B = C,

in which case the general solution is

X = A†CB† + Y − A†AYBB†,

where Y is arbitrary.
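A numerical illustration (a sketch, not part of the paper) of the criterion and of the particular solution X = A†CB†:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((4, 2))
B = rng.standard_normal((3, 5))
C = A @ rng.standard_normal((2, 3)) @ B          # solvable by construction

Agi, Bgi = np.linalg.pinv(A), np.linalg.pinv(B)
assert np.allclose(A @ Agi @ C @ Bgi @ B, C)     # the criterion holds
X = Agi @ C @ Bgi                                # a particular solution
assert np.allclose(A @ X @ B, C)

C_bad = np.ones((4, 5))                          # generically unsolvable
assert not np.allclose(A @ Agi @ C_bad @ Bgi @ B, C_bad)
```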


Proof. The proofs of 2·1 and 2·2 are evident. To prove 2·3 suppose K² = K; then, by 1·1, K = {(K†K)(KK†)}†. Thus E = KK† and F = K†K will do. Conversely, if K = (FE)†, then K is of the form EFPEF (since Q† = Q*(Q†*Q†Q†*)Q* by (10)). Hence K = EKF, so that

K² = E(FE)†FE(FE)†F = E(FE)†F = K.

Principal idempotent elements of a matrix. For any square matrix A there exists a unique set of matrices K_λ, defined for each complex number λ, such that

K_λK_μ = δ_λμ K_λ, (11)
ΣK_λ = I, (12)
AK_λ = K_λA, (13)
(A − λI)K_λ is nilpotent, (14)

the non-zero K_λ's being the principal idempotent elements of A (see Wedderburn (7)).† Unless λ is an eigenvalue of A, K_λ is zero, so that the sum in (12) is finite, though summed over all complex λ. The following theorem gives an explicit formula for K_λ in terms of g.i.'s.

THEOREM 3. If E_λ = I − {(A − λI)^n}†(A − λI)^n and F_λ = I − (A − λI)^n{(A − λI)^n}†, where n is sufficiently large (e.g. the order of A), then the principal idempotent elements of A are given by K_λ = (F_λE_λ)†, and n can be taken as unity if and only if A is diagonable.

Proof. First, suppose A is diagonable and put E_λ = I − (A − λI)†(A − λI) and F_λ = I − (A − λI)(A − λI)†. Then E_λ and F_λ are h.i. by 2·1 and are zero unless λ is an eigenvalue of A, by 1·3. Now

(A − μI)E_μ = 0 and F_λ(A − λI) = 0, (15)

so that μF_λE_μ = F_λAE_μ = λF_λE_μ. Thus

F_λE_μ = 0 if λ ≠ μ. (16)

Putting K_λ = (F_λE_λ)† we have, by 2·3,

K_λ = E_λ(F_λE_λ)†F_λ. (17)

Hence K_λK_μ = δ_λμK_λ by (16). Also (17) gives

F_μK_λE_ν = δ_μλδ_λν F_λE_λ. (18)

Any eigenvector z_α of A corresponding to the eigenvalue α satisfies E_αz_α = z_α. Since A is diagonable, any column vector x conformable with A is expressible as a sum of eigenvectors; i.e. it is expressible in the form (a finite sum over all complex λ)

x = ΣE_λx_λ.

† The existence of such K_λ's can be established as follows: let φ(λ) = (λ − λ_1)^{n_1} ... (λ − λ_k)^{n_k} be, say, the minimum function of A, where the factors (λ − λ_i)^{n_i} are mutually coprime. Then if φ_i(λ) = (λ − λ_i)^{−n_i}φ(λ)


Similarly, if y* is conformable with A, it is expressible as

y* = Σy_λ*F_λ.

Now y*(ΣK_λ)x = (Σ_μ y_μ*F_μ)(Σ_λ K_λ)(Σ_ν E_νx_ν) = Σ_λ y_λ*F_λE_λx_λ = y*x, by (18) and (16). Hence ΣK_λ = I.

Also, from (15) and (17) we obtain

AK_λ = λK_λ = K_λA. (19)

Thus conditions (13) and (14) are satisfied. Also

A = ΣλK_λ. (20)

Conversely, if it is not assumed that A is diagonable but that ΣK_λ = I, K_λ being defined as above (i.e. with n = 1), then x = ΣK_λx gives x as a sum of eigenvectors of A, since (19) was deduced without assuming the diagonability of A.

If A is not diagonable it seems more convenient simply to prove that for any set of K_λ's satisfying (11), (12), (13) and (14) each K_λ = (F_λE_λ)†, where F_λ and E_λ are defined as in the theorem.

Now, if the K_λ's satisfy (11), (12), (13) and (14), they must certainly satisfy ΣK_λ = I and

(A − λI)^nK_λ = 0 = K_λ(A − λI)^n,

where n is sufficiently large (if n is the order of A, this will do). From this and the definitions of E_λ and F_λ given in the theorem we obtain

E_λK_λ = K_λ = K_λF_λ. (21)

Using Euclid's algorithm we can find P and Q (polynomials in A) such that

I = (A − λI)^nP + Q(A − μI)^n,

provided that λ ≠ μ. Now F_λ(A − λI)^n = 0 = (A − μI)^nE_μ. Hence F_λE_μ = 0 if λ ≠ μ, which with (21) gives F_λK_μ = 0 = K_μE_λ whenever λ ≠ μ. But ΣK_λ = I. Thus

F_λK_λ = F_λ and K_λE_λ = E_λ. (22)

Using (21) and (22) it is a simple matter to verify that (F_λE_λ)† = K_λ. (Note that K_λ = I − {(I − E_λ)(I − F_λ)}† also.) Incidentally, this provides a proof of the uniqueness of K_λ.†

COROLLARY. If A is normal, it is diagonable and its principal idempotent elements are hermitian.

Proof. Since A − λI is normal, we can apply 1·8 to the definitions of E_λ and F_λ in Theorem 3. This gives E_λ = I − (A − λI)†(A − λI) and F_λ = I − (A − λI)(A − λI)†, whence A is diagonable. Also E_λ = F_λ, so that K_λ = E_λ is hermitian.

† In fact, ΣK_λ = I and (A − λI)^nK_λ = 0 are sufficient for uniqueness.


‡ For this result see, for example, Drazin (4).

If A is normal, (20) gives A = ΣλK_λ, and thus using 1·7, 1·4 and 2·2 we obtain A† = Σλ†K_λ.

A new type of spectral decomposition. Unless A is normal, there does not appear to be any simple expression for A† in terms of its principal idempotent elements. However, this difficulty is overcome with the following decomposition, which applies equally to rectangular matrices.

THEOREM 4. Any matrix A is uniquely expressible in the form

A = ΣαU_α,

this being a finite sum over real values of α, where U_α† = U_α*, and U_α*U_β = 0 and U_αU_β* = 0 whenever α ≠ β.

Proof. The condition on the U's can be written comprehensively as

U_αU_β*U_γ = δ_αβδ_βγ U_α.

For U_αU_α*U_α = U_α implies U_α*U_αU_α* = U_α*, whence, by uniqueness, U_α† = U_α* (as U_αU_α* and U_α*U_α are both hermitian). Also U_αU_β*U_β = 0 and U_αU_α*U_β = 0 respectively imply U_αU_β* = 0 and U_α*U_β = 0, by (2).

Define E_λ = I − (A*A − λI)†(A*A − λI). The matrix A*A is normal, being hermitian, and is non-negative definite. Hence the non-zero E_λ's are its principal idempotent elements and E_λ = 0 unless λ ≥ 0. Thus

A*A = ΣλE_λ and (A*A)† = Σλ†E_λ.

Hence A†A = (A*A)†A*A = Σλ†λE_λ = Σ_{λ>0} E_λ.

Put U_α = α^{−1}AE_{α²} if α > 0 and U_α = 0 otherwise. Then

Σ_{α>0} αU_α = A Σ_{λ>0} E_λ = AA†A = A.

Also, if α, γ > 0,

U_αU_β*U_γ = α^{−1}β^{−1}γ^{−1} AE_{α²}E_{β²}A*AE_{γ²} = δ_αβδ_βγ U_α.

To prove the uniqueness suppose A = Σ_{α>0} αV_α. Then it is easily verified that the non-zero expressions V_α*V_α, for α > 0, and I − ΣV_β*V_β are the principal idempotent elements of A*A corresponding respectively to the eigenvalues α² and 0. Hence V_α*V_α = E_{α²}, where α > 0, giving

U_α = α^{−1}AE_{α²} = α^{−1}(Σ_β βV_β)V_α*V_α = V_α.

Polar representation of a matrix. It is a well-known result (closely related to Autonne's) that any square matrix is the product of a hermitian with a unitary matrix (see, for example, Halmos (5)).

† Thus A† = Σα†U_α*.


Put H = ΣαU_αU_α*. Then H is non-negative definite hermitian, since U_α = 0 unless α > 0, and H² = AA*. (Such an H must be unique.) We have H†H = HH† = AA†.

Now AA† and A†A are equivalent under a unitary transformation, both being hermitian and having the same eigenvalues; i.e. there is a unitary matrix W satisfying WA†A = AA†W. Putting V = H†A + W − WA†A, we see that VV* = I and A = HV. This is the polar representation of A.

The polar representation is unique if A is non-singular, but not if A is singular. However, if we require A = HU, where H is (as before) non-negative definite hermitian, U† = U* and UU* = H†H, the representation is always unique and also exists for rectangular matrices. The uniqueness of H follows from

AA* = HUU*H = H(H†H)H = H²,

and that of U from

H†A = H†HU = UU*U = U.

The existence of H and U is established by H = ΣαU_αU_α* and U = ΣU_α. If we put G = ΣαU_α*U_α we obtain the alternative representation A = UG.

I wish to thank Dr M. P. Drazin for useful suggestions and for advice in the preparation of this paper.

REFERENCES

(1) AUTONNE, L. Sur les matrices hypohermitiennes et sur les matrices unitaires. Ann. Univ. Lyon (2), 38 (1915), 1-77.
(2) BJERHAMMAR, A. Rectangular reciprocal matrices, with special reference to geodetic calculations. Bull. géod. int. (1951), pp. 188-220.
(3) CECIONI, F. Sopra operazioni algebriche. Ann. Scu. norm. sup. Pisa, 11 (1910), 17-20.
(4) DRAZIN, M. P. On diagonable and normal matrices. Quart. J. Math. (2), 2 (1951), 189-98.
(5) HALMOS, P. R. Finite dimensional vector spaces (Princeton, 1942).
(6) MURRAY, F. J. and VON NEUMANN, J. On rings of operators. Ann. Math., Princeton, (2), 37 (1936), 141-3.
(7) WEDDERBURN, J. H. M. Lectures on matrices (Colloq. Publ. Amer. math. Soc. no. 17, 1934).

    S T JOHN'S COLLEGE

    CAMBRIDGE