Matrix Algebra Notes


  • 7/21/2019 Matrix Algebra Notes


    Econometrics - II

Indira Gandhi Institute of Development Research, January - May Semester 2013

(c) Subrata Sarkar

    Elements Of Matrix Algebra

    Start with an example

$$Y_t = \beta_1 + \beta_2 X_{2t} + \beta_3 X_{3t} + \cdots + \beta_k X_{kt} + U_t, \qquad t = 1, 2, \ldots, n$$

    Writing for each observation

$$Y_1 = \beta_1 + \beta_2 X_{21} + \beta_3 X_{31} + \cdots + \beta_k X_{k1} + U_1$$
$$Y_2 = \beta_1 + \beta_2 X_{22} + \beta_3 X_{32} + \cdots + \beta_k X_{k2} + U_2$$
$$\vdots$$
$$Y_n = \beta_1 + \beta_2 X_{2n} + \beta_3 X_{3n} + \cdots + \beta_k X_{kn} + U_n$$

    Summarize these n equations in a convenient form

$$\begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}
=
\begin{bmatrix}
1 & X_{21} & \cdots & X_{k1} \\
1 & X_{22} & \cdots & X_{k2} \\
\vdots & \vdots & & \vdots \\
1 & X_{2n} & \cdots & X_{kn}
\end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}
+
\begin{bmatrix} U_1 \\ U_2 \\ \vdots \\ U_n \end{bmatrix}$$

or, $Y = X\beta + U$

$Y$, $X$, $\beta$, $U$ are vectors and matrices.

Vector: an ordered sequence of numbers arranged in a row or a column.

$Y$, $\beta$, $U$ are arranged in columns: $U$ and $Y$ are $n$-element column vectors, and $\beta$ is a $k$-element column vector.
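The matrix form $Y = X\beta + U$ can be checked numerically; a minimal numpy sketch (the data values, $n = 4$ and $k = 3$, are made up for illustration and are not from the notes):

```python
import numpy as np

# Hypothetical data: n = 4 observations, k = 3 regressors (incl. the constant).
n, k = 4, 3
X = np.array([[1., 2., 5.],
              [1., 3., 1.],
              [1., 0., 2.],
              [1., 4., 4.]])      # n x k design matrix; first column is the constant
beta = np.array([1., 2., 3.])     # k x 1 coefficient vector
U = np.zeros(n)                   # disturbances, set to zero here for clarity

# One matrix product computes all n equations Y_t = b1 + b2*X_2t + b3*X_3t + U_t.
Y = X @ beta + U
```

Each entry of `Y` agrees with the corresponding scalar equation written observation by observation.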


$Y' = [Y_1, Y_2, \ldots, Y_n]$, the transpose of $Y$; $U' = [U_1, U_2, \ldots, U_n]$, the transpose of $U$; $\beta' = [\beta_1, \beta_2, \ldots, \beta_k]$, the transpose of $\beta$.

$$X = \begin{bmatrix}
1 & X_{21} & \cdots & X_{k1} \\
1 & X_{22} & \cdots & X_{k2} \\
\vdots & \vdots & & \vdots \\
1 & X_{2n} & \cdots & X_{kn}
\end{bmatrix}$$

    X is a matrix

Matrix: A rectangular array of elements. Order of a matrix = number of rows $\times$ number of columns $= n \times k$ (the number of rows is always written first).

    Observations:

1. A column vector of $n$ elements, i.e.
$$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$$
is a matrix of order $n \times 1$.

2. A row vector of $k$ elements, i.e. $Z = [Z_1 \; Z_2 \; \cdots \; Z_k]$, is a matrix of order $1 \times k$.

3. Representing the $X$ matrix by its columns or by its rows:
$$X = [\underset{n \times 1}{X_1} \;\; \underset{n \times 1}{X_2} \;\; \cdots \;\; \underset{n \times 1}{X_k}]
\qquad \text{or} \qquad
X = \begin{bmatrix} S_1' \\ S_2' \\ \vdots \\ S_n' \end{bmatrix}$$
where each $X_j$ is an $n \times 1$ column and each $S_i'$ is a $1 \times k$ row.


4. Transpose of a matrix

$$X_{n \times k} = \{X_{ij}\} \implies X'_{k \times n} = \{X_{ji}\}$$

Example:
$$X = \begin{bmatrix} 1 & 6 & 4 \\ 3 & 2 & 2 \\ 4 & 1 & 1 \\ 5 & 3 & 5 \end{bmatrix}
\qquad
X' = \begin{bmatrix} 1 & 3 & 4 & 5 \\ 6 & 2 & 1 & 3 \\ 4 & 2 & 1 & 5 \end{bmatrix}$$

1. Operations on Vectors

(a) Multiplication by a scalar
$$2 \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \cdot 2 \\ 2 \cdot 3 \\ 2 \cdot 4 \end{bmatrix} = \begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix}$$

(b) Addition of two vectors: $U + V$ = sum of corresponding elements; the orders have to be the same.

(c) Linear combination: $K_1 U + K_2 V$, where $K_1$ and $K_2$ are constants.

(d) Vector multiplication
$$a'b = [1 \; 2 \; 3] \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6$$
$a'$ is $1 \times 3$ and $b$ is $3 \times 1$: the numbers of elements have to be the same.
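The inner product above is a one-line computation in numpy (a quick check, not part of the original notes):

```python
import numpy as np

a = np.array([1., 2., 3.])   # a' is 1 x 3
b = np.array([4., 5., 6.])   # b  is 3 x 1
ab = a @ b                   # 1*4 + 2*5 + 3*6
```

If the two vectors had different lengths, `@` would raise an error, mirroring the conformability requirement.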


A special vector: $S$, the sum vector
$$S_{n \times 1} = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}
\qquad \text{so that} \qquad
S'a = \sum_{i=1}^{n} a_i$$

2. Operations on Matrices

(a) Multiplication by a scalar: $KA = \{K a_{ij}\}$

(b) Addition of two matrices: sum of corresponding elements.

(c) Equality of matrices: the orders have to be the same.

(d) Matrix multiplication: $A_{n \times k} B_{k \times m}$
$$\underset{n \times k}{A}\;\underset{k \times m}{B} =
\begin{bmatrix} a_{11} & \cdots & a_{1k} \\ a_{21} & \cdots & a_{2k} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nk} \end{bmatrix}
\begin{bmatrix} b_{11} & \cdots & b_{1m} \\ b_{21} & \cdots & b_{2m} \\ \vdots & & \vdots \\ b_{k1} & \cdots & b_{km} \end{bmatrix}
= \underset{n \times m}{\begin{bmatrix} a_1'b_1 & \cdots & a_1'b_m \\ a_2'b_1 & \cdots & a_2'b_m \\ \vdots & & \vdots \\ a_n'b_1 & \cdots & a_n'b_m \end{bmatrix}}$$
where $a_i'$ is the $i$-th row of $A$ and $b_j$ the $j$-th column of $B$.

The two matrices have to be conformable.

An example:
$$\underset{3 \times 2}{\begin{bmatrix} 2 & 3 \\ 3 & 1 \\ 4 & 2 \end{bmatrix}}
\underset{2 \times 2}{\begin{bmatrix} 6 & 3 \\ 2 & 2 \end{bmatrix}}
= \begin{bmatrix}
2 \cdot 6 + 3 \cdot 2 & 2 \cdot 3 + 3 \cdot 2 \\
3 \cdot 6 + 1 \cdot 2 & 3 \cdot 3 + 1 \cdot 2 \\
4 \cdot 6 + 2 \cdot 2 & 4 \cdot 3 + 2 \cdot 2
\end{bmatrix}$$
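The example product can be verified numerically; the matrices below are copied from the example, the numpy usage is mine:

```python
import numpy as np

A = np.array([[2., 3.],
              [3., 1.],
              [4., 2.]])          # 3 x 2
B = np.array([[6., 3.],
              [2., 2.]])          # 2 x 2; inner dimensions match, so A and B are conformable
C = A @ B                         # 3 x 2 result, entry (i,j) = a_i'b_j
```

Trying `B @ A` instead would fail, since a 2 x 2 and a 3 x 2 matrix are not conformable in that order.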

A special case:
$$\begin{bmatrix} a_{11} & \cdots & a_{1m} \\ \vdots & & \vdots \\ a_{n1} & \cdots & a_{nm} \end{bmatrix}
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_m \end{bmatrix}
= \begin{bmatrix} a_1'\beta \\ a_2'\beta \\ \vdots \\ a_n'\beta \end{bmatrix}$$


3. Some Special Matrices

(a) Diagonal matrix
$$A_{n \times n} = \begin{bmatrix} a_{11} & & & \\ & a_{22} & & \\ & & \ddots & \\ & & & a_{nn} \end{bmatrix}$$
has to be square.

(b) The identity matrix
$$I_{n \times n} = \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & \ddots & \\ & & & 1 \end{bmatrix}$$

(c) Symmetric matrix: $A = A'$

(d) A scalar matrix
$$\begin{bmatrix} \lambda & & & \\ & \lambda & & \\ & & \ddots & \\ & & & \lambda \end{bmatrix} = \lambda I$$

(e) Idempotent matrix (has to be square):
$$A = A^2 \implies A = A^2 = A^3 = \cdots$$
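A standard example of an idempotent matrix is a projection matrix; the sketch below (my own illustration, with made-up numbers) builds $M = I - X(X'X)^{-1}X'$ and checks $M = M^2 = M^3$:

```python
import numpy as np

# Hypothetical full-column-rank X; M projects onto the space orthogonal to its columns.
X = np.array([[1., 0.],
              [1., 1.],
              [1., 2.]])
M = np.eye(3) - X @ np.linalg.inv(X.T @ X) @ X.T   # idempotent by construction
```

Multiplying `M` by itself any number of times returns `M` again, up to floating-point rounding.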


4. Some Properties of Matrices

(a) $(AB)' = B'A'$, $(ABC)' = C'B'A'$

(b) $(A + B) + C = A + (B + C)$

(c) $(AB)C = A(BC)$

(d) $A(B + C) = AB + AC$

(e) $AI = A$

(f) $(A + B)' = A' + B'$

5. Trace of a Square Matrix

$$\operatorname{tr}(A) = \sum_{i=1}^{n} a_{ii}$$

Properties of trace: the trace is invariant under cyclic permutations (whenever the products are defined):
$$\operatorname{tr}(ABC) = \operatorname{tr}(BCA) = \operatorname{tr}(CAB)$$
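The cyclic property can be checked on random conformable matrices; a small numpy sketch (the shapes and random draws are my own illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # shapes chosen so that ABC is square (2 x 2)
B = rng.standard_normal((3, 4))
C = rng.standard_normal((4, 2))

t1 = np.trace(A @ B @ C)          # 2 x 2 product
t2 = np.trace(B @ C @ A)          # cyclic permutation, a 3 x 3 product
t3 = np.trace(C @ A @ B)          # cyclic permutation, a 4 x 4 product
```

Note the three products have different orders, yet their traces coincide; a non-cyclic reordering such as $\operatorname{tr}(ACB)$ is not even defined here.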

6. Matrix Inverse

In algebra we have $ab = 1 \implies b = \frac{1}{a}$.

In matrix algebra we ask: given $A_{n \times n}$, does there exist $B$ such that $AB = I_n$?

Answer: If the columns of $A$ are linearly independent, then there exists $B$ such that $AB = I$. In that case $B$ is denoted $A^{-1}$, i.e. $AA^{-1} = I$.


Linear Independence: $a_1, a_2, \ldots, a_n$ are linearly independent if $\sum_i c_i a_i = 0$ only for $c_1 = c_2 = \cdots = c_n = 0$. If not, then some $a_i$ can be written as a linear combination of the other $a_i$'s.

Theorem: If all columns of $A$ are linearly independent, then so are all the rows. Then there exists $C$ such that $CA = I$.

Now
$$C = CI = C(AB) = (CA)B = IB = B$$
Therefore $C = B = A^{-1}$.

Therefore, if $A$ is a square matrix with all columns (rows) linearly independent, then there exists a unique matrix, called the inverse of $A$ and denoted by $A^{-1}$, such that
$$AA^{-1} = A^{-1}A = I$$
Such an $A$ is non-singular.

7. Properties of Inverse

(a) $[A^{-1}]^{-1} = A$

(b) $[A']^{-1} = [A^{-1}]'$

(c) $[AB]^{-1} = B^{-1}A^{-1}$

8. Calculation of the Inverse

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$

Replace each element by its minor: $\begin{bmatrix} a_{22} & a_{21} \\ a_{12} & a_{11} \end{bmatrix}$


Sign the minors, i.e. get the cofactors $(-1)^{i+j}$ times the minors: $\begin{bmatrix} a_{22} & -a_{21} \\ -a_{12} & a_{11} \end{bmatrix}$

Transpose: $\begin{bmatrix} a_{22} & -a_{12} \\ -a_{21} & a_{11} \end{bmatrix} = \operatorname{Adj}(A)$

Get the determinant: $a_{11}a_{22} - a_{12}a_{21} = |A|$

Divide each element of $\operatorname{Adj}(A)$ by $|A|$. Therefore
$$A^{-1} = \frac{1}{|A|}\operatorname{Adj}(A)$$

For a $3 \times 3$ matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$$

Step 1: Minors; e.g. for the first row they are
$$(a_{22}a_{33} - a_{23}a_{32}), \quad (a_{21}a_{33} - a_{23}a_{31}), \quad (a_{21}a_{32} - a_{22}a_{31})$$

Step 2: Cofactors; apply the sign pattern
$$\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}$$

Step 3: Transpose the cofactor matrix to get the adjoint.


Step 4: Determinant (expanding along the first row)
$$|A| = a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31})$$

Step 5: Inverse

Divide every element of the adjoint (Step 3) by the determinant (Step 4).
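For the $2 \times 2$ case the adjoint recipe can be carried out directly and compared with a library inverse; a sketch with made-up numbers (not from the notes):

```python
import numpy as np

A = np.array([[3., 1.],
              [2., 4.]])          # a hypothetical non-singular 2 x 2 matrix

# Steps from the text: minors -> cofactors -> transpose (adjoint) -> divide by |A|.
adj = np.array([[ A[1, 1], -A[0, 1]],
                [-A[1, 0],  A[0, 0]]])
det = A[0, 0] * A[1, 1] - A[0, 1] * A[1, 0]
A_inv = adj / det                 # A^{-1} = Adj(A) / |A|
```

The result matches `np.linalg.inv(A)`, and $AA^{-1}$ returns the identity.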

9. The Rank of a Matrix

The rank of a matrix $A$, not necessarily square, is the maximum number of linearly independent columns (or rows).

The maximum number of linearly independent columns of $A$ = the maximum number of linearly independent rows of $A$.

The rank is unique and is denoted by $\rho(A)$:
$$\rho(A_{m \times n}) \le \min[m, n]$$

When $\rho(A) = m < n$, $A$ has full row rank.

When $\rho(A) = n < m$, $A$ has full column rank.

If $A$ is a square matrix of order $n$ with full row (column) rank, then $A$ is non-singular.

Example:
$$A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 0 & 1 & 1 \\ 2 & 2 & 4 & 5 \\ 3 & 6 & 7 & 4 \end{bmatrix}$$
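In this example the third row equals the sum of the first two, so the rows are linearly dependent; a quick numpy check (the matrix is from the example, the rank claim is my own observation):

```python
import numpy as np

A = np.array([[1., 2., 3., 4.],
              [1., 0., 1., 1.],
              [2., 2., 4., 5.],   # equals row 1 + row 2, so the rows are dependent
              [3., 6., 7., 4.]])
r = np.linalg.matrix_rank(A)      # maximum number of linearly independent rows/columns
```

Since $\rho(A) < 4$, this square matrix is singular and has no inverse.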


Summary of Basic Matrix Algebra

1. Matrix: a rectangular array of elements.
$$A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 4 & 5 & 6 & 7 \\ 7 & 8 & 9 & 2 \end{bmatrix}_{3 \times 4} = \{a_{ij}\}$$
$A$ is a 3 (rows) $\times$ 4 (columns) matrix.

2. Row vector: $x = [1 \; 2 \; 3 \; 4]_{1 \times 4}$

3. Column vector: $y = \begin{bmatrix} 7 \\ 8 \\ 9 \end{bmatrix}_{3 \times 1}$

4. Diagonal matrix: $D = \begin{bmatrix} 1 & & \\ & 2 & \\ & & 3 \end{bmatrix}$

5. Symmetric matrix: $\{a_{ij}\} = \{a_{ji}\}$, e.g. $A = \begin{bmatrix} 1 & 2 \\ 2 & 7 \end{bmatrix}$

6. Transpose of a matrix:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}_{2 \times 3}
\qquad
A' = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}_{3 \times 2}$$
Symmetric matrix: $A = A'$.


7. Rank of a matrix: the number of linearly independent rows (columns).
$$\operatorname{Rank}(A_{m \times n}) \le \min[m, n]$$

8. Square matrix $A_{n \times n}$: if $\operatorname{Rank}(A) = n$, then $A$ has an inverse:
$$AA^{-1} = I = A^{-1}A,
\qquad
I_{n \times n} = \begin{bmatrix} 1 & & \\ & \ddots & \\ & & 1 \end{bmatrix}$$

9. Addition of matrices: $A_{n \times m}$ and $B_{n \times m}$ (same order):
$$A + B = \{a_{ij}\} + \{b_{ij}\} = \{a_{ij} + b_{ij}\}$$

10. Multiplication: $A_{n \times m} B_{m \times p} = (AB)_{n \times p}$; the matrices must be conformable (see the earlier example).

11. $(AB)' = B'A'$

12. $(AB)^{-1} = B^{-1}A^{-1}$, assuming $A$ and $B$ are square matrices with full rank.


Quadratic Form and Matrix Derivatives

1. Quadratic Form

Consider the expression $q_1 = 2X_1^2 + X_1X_2 + X_3^2$. Calling $X$ the column vector of the $X$'s, i.e. $X = [X_1, X_2, X_3]'$, a quadratic form can be put in the form $q = X'AX$ with $A$ symmetric. $A$ is unique once the order of $X$ is chosen. $A$ has

- in the diagonal, $a_{ii}$, the coefficient attached to $X_i^2$;
- in the off-diagonal, $a_{ij}$, one half of the coefficient attached to $X_iX_j$.

In our example:
$$A = \begin{bmatrix} 2 & 1/2 & 0 \\ 1/2 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

Example 1: $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$
$$X'AX = 2X_1^2 + 2X_1X_2 + X_2^2 = (X_1 + X_2)^2 + X_1^2 > 0 \quad \forall X \ne 0$$

Example 2: $A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
$$X'AX = X_1^2 + 2X_1X_2 + X_2^2 = (X_1 + X_2)^2 \ge 0 \quad \forall X,$$
but there exists $X \ne 0$ such that $X'AX = 0$.
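The difference between the two examples shows up at any $X$ with $X_1 + X_2 = 0$; a small numpy check (the matrices are from the examples, the test vector is my own):

```python
import numpy as np

A1 = np.array([[2., 1.],
               [1., 1.]])         # Example 1: positive definite
A2 = np.array([[1., 1.],
               [1., 1.]])         # Example 2: positive semi-definite only

x = np.array([1., -1.])           # a nonzero vector with X1 + X2 = 0
q1 = x @ A1 @ x                   # (X1 + X2)^2 + X1^2 = 0 + 1
q2 = x @ A2 @ x                   # (X1 + X2)^2 = 0 even though x != 0
```

So `A1` yields a strictly positive value at this nonzero `x`, while `A2` vanishes there.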

    Definition:


A quadratic form is said to be positive definite if $X'AX > 0$ for all $X \ne 0$.

A quadratic form is said to be positive semi-definite if $X'AX \ge 0$ for all $X$ and there exists $X \ne 0$ such that $X'AX = 0$.

Remarks:

(a) A matrix is said to be n.n.d (non-negative definite) if it is either p.d or p.s.d.

(b) The concepts of n.d and n.s.d can be defined similarly (by reversing the signs).

(c) A symmetric matrix $A$ is said to be p.d (p.s.d) if the associated quadratic form is p.d (p.s.d).

There are three equivalent conditions for a symmetric n.n.d matrix $A$ to be p.d. These are iff conditions:

(a) The matrix $A$ is non-singular.

(b) There exists a non-singular matrix $P$ such that $P'P = A$.

(c) There exists a non-singular matrix $Q$ such that $Q'AQ = I$.

Some more properties related to quadratic forms:

(a) Let $B$ be any $n \times k$ matrix. Then

i. $B'B$ (of order $k \times k$) is n.n.d

ii. $B'B$ is p.d if $\operatorname{rank}(B) = k$

iii. $B'B$ is p.s.d if $\operatorname{rank}(B) < k$

(b) $A$ p.d, $B$ n.n.d $\implies$ $A + B$ is p.d.

(c) $A$ p.d $n \times n$, $B$ any $n \times k$:


i. $B'AB$ (of order $k \times k$) is n.n.d

ii. $B'AB$ is p.d if $\operatorname{rank}(B) = k$

iii. $B'AB$ is p.s.d if $\operatorname{rank}(B) < k$

2. Matrix Derivatives

(a) Scalar Function

$Y = f(X)$ where $X$ is a vector. For example,
$$Y = A X_1^{\alpha} X_2^{\beta} = f(X_1, X_2), \qquad X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}$$

Definition:
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} \partial Y/\partial X_1 \\ \partial Y/\partial X_2 \\ \vdots \\ \partial Y/\partial X_n \end{bmatrix}
\quad \text{(gradient vector, column form)}$$
$$\frac{\partial Y}{\partial X'} = \begin{bmatrix} \dfrac{\partial Y}{\partial X_1} & \cdots & \dfrac{\partial Y}{\partial X_n} \end{bmatrix}
\quad \text{(gradient vector, row form)}$$

In our example:
$$\frac{\partial Y}{\partial X} = \begin{bmatrix} \alpha A X_1^{\alpha - 1} X_2^{\beta} \\ \beta A X_1^{\alpha} X_2^{\beta - 1} \end{bmatrix}$$

i. Special case: linear function
$$Y = P_1X_1 + P_2X_2 + \cdots + P_nX_n = P'X = X'P$$
$$\frac{\partial Y}{\partial X'} = [P_1 \; P_2 \; \cdots \; P_n] = P', \qquad \frac{\partial Y}{\partial X} = P$$

ii. Special case: quadratic form
$$Y = X'AX, \quad A \text{ symmetric} \implies \frac{\partial Y}{\partial X} = 2AX$$
Example: $A = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$
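The gradient formula $\partial(X'AX)/\partial X = 2AX$ can be confirmed against a central-difference approximation; a numpy sketch using the example's $A$ (the evaluation point is my own choice):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.]])          # the symmetric A from the example
x = np.array([0.5, -1.0])         # an arbitrary evaluation point

grad = 2 * A @ x                  # analytic gradient of Y = X'AX

# Numerical check via central differences, coordinate by coordinate.
eps = 1e-6
num = np.zeros(2)
for i in range(2):
    e = np.zeros(2)
    e[i] = eps
    num[i] = ((x + e) @ A @ (x + e) - (x - e) @ A @ (x - e)) / (2 * eps)
```

For a quadratic, the central difference is exact up to floating-point rounding, so the two gradients agree closely.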


(b) Vector Function

$$Y_{m \times 1} = F_{m \times 1}(X_{n \times 1}):
\qquad Y_1 = F_1(X), \quad Y_2 = F_2(X), \quad \ldots, \quad Y_m = F_m(X)$$

Then
$$\frac{\partial Y}{\partial X'} = \begin{bmatrix}
\dfrac{\partial Y_1}{\partial X_1} & \dfrac{\partial Y_1}{\partial X_2} & \cdots & \dfrac{\partial Y_1}{\partial X_n} \\
\vdots & & & \vdots \\
\dfrac{\partial Y_m}{\partial X_1} & \dfrac{\partial Y_m}{\partial X_2} & \cdots & \dfrac{\partial Y_m}{\partial X_n}
\end{bmatrix}$$
the Jacobian matrix.

Special case: linear vector functions
$$Y_1 = P_1'X, \quad Y_2 = P_2'X, \quad \ldots, \quad Y_m = P_m'X$$
$$Y = PX, \qquad P = \begin{bmatrix} P_1' \\ P_2' \\ \vdots \\ P_m' \end{bmatrix}$$
Then
$$\frac{\partial Y}{\partial X'} = P$$
In particular, writing $X = IX$,
$$\frac{\partial X}{\partial X'} = I$$


(c) Application of Derivatives

$$q = \tfrac{1}{2}X'AX + b'X + c \quad \text{where } A \text{ is p.d.}$$
$$\frac{\partial q}{\partial X} = \tfrac{1}{2} \cdot 2AX + b = AX + b = 0 \quad \text{(F.O.C.)}$$
Therefore
$$AX^* = -b \implies X^* = -A^{-1}b \quad [A^{-1} \text{ exists since } A \text{ is p.d.}]$$
Also
$$\frac{\partial^2 q}{\partial X \partial X'} = A \quad \text{(the Hessian matrix)}$$
is p.d, so $X^*$ defines a minimum of $q$.

Proof:

Let $X = X^* + Z$. Then
$$X'AX = (X^* + Z)'A(X^* + Z) = X^{*\prime}AX^* + 2X^{*\prime}AZ + Z'AZ$$
Therefore
$$q = \tfrac{1}{2}X^{*\prime}AX^* + X^{*\prime}AZ + \tfrac{1}{2}Z'AZ + b'(X^* + Z) + c$$
$$= \underbrace{\tfrac{1}{2}X^{*\prime}AX^* + b'X^* + c}_{q^*} + X^{*\prime}AZ + \tfrac{1}{2}Z'AZ + b'Z$$
$$= q^* + (-A^{-1}b)'AZ + \tfrac{1}{2}Z'AZ + b'Z$$
$$= q^* - b'Z + \tfrac{1}{2}Z'AZ + b'Z$$
$$= q^* + \tfrac{1}{2}Z'AZ > q^* \quad \text{for } Z \ne 0$$
Therefore $q > q^*$ for $X \ne X^*$.
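The minimizer $X^* = -A^{-1}b$ can be checked numerically: the gradient vanishes at $X^*$ and any perturbation raises $q$. A numpy sketch (the particular $A$, $b$, $c$, and perturbation are made up):

```python
import numpy as np

A = np.array([[2., 1.],
              [1., 1.]])          # a hypothetical positive definite A
b = np.array([1., -1.])
c = 3.0

def q(x):
    # q = (1/2) x'Ax + b'x + c
    return 0.5 * x @ A @ x + b @ x + c

x_star = -np.linalg.inv(A) @ b    # X* = -A^{-1} b from the first-order condition

grad = A @ x_star + b             # should be the zero vector
z = np.array([0.3, -0.7])         # an arbitrary nonzero perturbation
```

Here `q(x_star + z) - q(x_star)` equals $\tfrac{1}{2}z'Az$, which is positive because $A$ is p.d.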


Matrix Statistics

(a) Random Vectors and Matrices

If $X_1, X_2, \ldots, X_n$ are random variables then
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix}$$
is a random vector; its elements are r.v.'s.

Likewise, $W = \{W_{ij}\}$ is a random matrix when the $W_{ij}$'s are all random variables.

(b) Expectation
$$E(X) = [E(X_i)] = \begin{bmatrix} E(X_1) \\ E(X_2) \\ \vdots \\ E(X_n) \end{bmatrix}$$
Let $E(X_i) = \mu$ for all $i$. Then
$$E(X) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix}
= \begin{bmatrix} \mu \\ \vdots \\ \mu \end{bmatrix}
= \mu \, S_{n \times 1}$$
where $S$ is the sum vector.

Properties of Expectation (let $A$, $B$, $C$, $U$ be constant):

i. $E(U) = U$

ii. $E(AX) = AE(X)$

iii. $E(X + Y) = E(X) + E(Y)$

iv. $E(BXC) = B\,E(X)\,C$

v. $E(W_1W_2) = E(W_1)E(W_2)$ when $W_1$ and $W_2$ are independent


(c) Variance and Covariance Matrix

$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix},
\qquad
E(X) = \begin{bmatrix} E(X_1) \\ \vdots \\ E(X_n) \end{bmatrix}
= \begin{bmatrix} \mu_1 \\ \vdots \\ \mu_n \end{bmatrix} = \mu,
\qquad E(X_i) = \mu_i$$

i. There are $n$ expectations.

ii. There are variances and covariances:

- there are $n$ variances $E[X_i - \mu_i]^2 = \sigma_{ii} > 0$;
- there are $n(n-1)/2$ covariances $E[X_i - \mu_i][X_j - \mu_j] = \sigma_{ij}$, $i \ne j$, with $\sigma_{ij} = \sigma_{ji}$.

The variance-covariance matrix of $X$ can be written as
$$V(X) = E[(X - \mu)(X - \mu)'] = \{E(X_i - \mu_i)(X_j - \mu_j)\} = \{\sigma_{ij}\} = V$$
In the diagonal we have variances and in the off-diagonal we have covariances. The matrix is symmetric.

Remarks:

i. If the $X_i$'s are uncorrelated then $\sigma_{ij} = 0$ for all $i \ne j$, so $V = \operatorname{diag}\{\sigma_{ii}\}$.

ii. In addition, if there is homoskedasticity, i.e. $\sigma_{ii} = \sigma^2$ for all $i$, then $V = \sigma^2 I_n$.

iii. If $E(X) = 0$ then $V(X) = E(XX')$.

(d) Linear Transformation

Consider $X$ with $E(X) = \mu$, $V(X) = V$. Define $Y = AX$, a linear transformation of $X$.


Then
$$E(Y) = A\mu, \qquad V(Y) = AVA'$$

Proof: $Y = AX$
$$E(Y) = E(AX) = AE(X) = A\mu$$
$$\begin{aligned}
V(Y) &= E[(Y - E(Y))(Y - E(Y))'] \\
&= E[AX - A\mu][AX - A\mu]' \\
&= E[A(X - \mu)][A(X - \mu)]' \\
&= E[A(X - \mu)(X - \mu)'A'] \\
&= A \, E[(X - \mu)(X - \mu)'] \, A' \\
&= AVA'
\end{aligned}$$
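The moment formulas $E(Y) = A\mu$ and $V(Y) = AVA'$ can be illustrated by simulation; a numpy sketch (the particular $\mu$, $V$, $A$, seed, and sample size are my own, and the agreement is only up to sampling error):

```python
import numpy as np

rng = np.random.default_rng(42)
mu = np.array([1., 2.])
V = np.array([[2.0, 0.5],
              [0.5, 1.0]])
A = np.array([[1.,  1.],
              [1., -1.],
              [2.,  0.]])                           # a 3 x 2 linear transformation

X = rng.multivariate_normal(mu, V, size=200_000)    # rows are draws of X
Y = X @ A.T                                         # Y = AX applied draw by draw

EY = Y.mean(axis=0)                                 # sample analogue of A mu
VY = np.cov(Y, rowvar=False)                        # sample analogue of A V A'
```

With 200,000 draws the sample mean and covariance sit well within sampling error of $A\mu$ and $AVA'$.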

Now consider the scalar linear transformation $Y = Z'X$:

$V(Y) > 0$ if $Y$ is not a constant, i.e. if $Z'X$ is not a constant.

$V(Y) = 0$ if $Y$ is a constant, i.e. if $Z'X$ is a constant, i.e. the $X_i$ are linearly dependent.

$$V(Y) = Z'VZ
\begin{cases}
> 0 \;\; \forall\, Z \ne 0 & \text{if the } X_i \text{ are linearly independent} \\
= 0 \text{ for some } Z \ne 0 & \text{if the } X_i \text{ are linearly dependent}
\end{cases}$$

Conclusion: the variance-covariance matrix is always p.d, except in cases where the $X_i$'s are linearly dependent, in which case it is a p.s.d matrix.

Corollary: Let $E(X) = \mu$, $V(X) = V$ with $V$ positive definite. Then it is possible to get a standard vector through a linear transformation, i.e. one with $E(Y) = 0$, $V(Y) = I$.

Define $Y = Q[X - \mu]$:
$$E(Y) = QE(X - \mu) = Q[E(X) - \mu] = 0$$
$$V(Y) = QVQ' = I \quad \text{for some } Q$$


(e) Expectation of a Quadratic Form

Let $E(X) = 0$, $V(X) = V$, and $q = X'AX$ where $A$ is a symmetric matrix of constants. Then
$$E(q) = E(X'AX) = \operatorname{tr}(AV)$$

Trace: the trace of a square matrix is the sum of its diagonal elements.

Proof: $X'AX$ is a scalar, and so equal to its trace:
$$X'AX = \operatorname{tr}(X'AX)$$
$$E(X'AX) = E(\operatorname{tr} X'AX)$$
The trace is invariant under cyclic permutations:
$$E(X'AX) = E(\operatorname{tr} AXX')$$
The trace is a linear operator:
$$E(X'AX) = \operatorname{tr} E(AXX') = \operatorname{tr} A\,E(XX') = \operatorname{tr}(AV)$$

Example: $V = \sigma^2 I_n$, $A$ idempotent of rank $K$. Therefore
$$E(X'AX) = \operatorname{tr}(A \sigma^2 I) = \sigma^2 \operatorname{tr}(A) = \sigma^2 K$$
[$\operatorname{rank}(A) = \operatorname{tr}(A)$ since $A$ is idempotent.]
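The example $E(X'AX) = \sigma^2 K$ can be verified deterministically via the trace identity; a numpy sketch (the projection matrix and $\sigma^2 = 2.5$ are my own illustration):

```python
import numpy as np

sigma2 = 2.5
n = 5
V = sigma2 * np.eye(n)                   # homoskedastic, uncorrelated case

# An idempotent A of rank K = 2: projection onto the column space of some X.
X = np.array([[1., 0.],
              [1., 1.],
              [1., 2.],
              [1., 3.],
              [1., 4.]])
A = X @ np.linalg.inv(X.T @ X) @ X.T     # idempotent, tr(A) = rank(A) = 2

expected_q = np.trace(A @ V)             # E(X'AX) = tr(AV) = sigma^2 * K
```

No simulation is needed: the result follows from $\operatorname{tr}(A\sigma^2 I) = \sigma^2\operatorname{tr}(A)$ and the idempotency of `A`.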

(f) Multivariate Normal Distribution and Related Distributions


i. Introduction

Let $X_i$, $i = 1, 2, \ldots, n$, be $n$ independent normal random variables with
$$E(X_i) = \mu_i, \qquad V(X_i) = \sigma_i^2, \qquad X_i \sim N(\mu_i, \sigma_i^2)$$

A. Density of $X_i$:
$$f(X_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \, e^{-\frac{1}{2\sigma_i^2}(X_i - \mu_i)^2}$$

B. The joint density of $X_1, X_2, \ldots, X_n$, when the $X_i$'s are independent, is the product of the individual densities:
$$f(X_1, X_2, \ldots, X_n) = (2\pi)^{-\frac{n}{2}} \Big(\prod_{i=1}^{n} \sigma_i^2\Big)^{-\frac{1}{2}} e^{-\frac{1}{2}\sum_i \frac{1}{\sigma_i^2}(X_i - \mu_i)^2}$$

Let us write the above in vector-matrix notation:
$$X = \begin{bmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{bmatrix},
\qquad
E(X) = \begin{bmatrix} \mu_1 \\ \mu_2 \\ \vdots \\ \mu_n \end{bmatrix} = \mu,
\qquad
V(X) = \begin{bmatrix} \sigma_1^2 & & \\ & \ddots & \\ & & \sigma_n^2 \end{bmatrix} = V$$

Note:
$$|V| = \sigma_1^2 \sigma_2^2 \cdots \sigma_n^2,
\qquad
V^{-1} = \begin{bmatrix} 1/\sigma_1^2 & & \\ & \ddots & \\ & & 1/\sigma_n^2 \end{bmatrix}$$
and
$$\sum_i \frac{1}{\sigma_i^2}(X_i - \mu_i)^2
= \frac{1}{\sigma_1^2}(X_1 - \mu_1)^2 + \frac{1}{\sigma_2^2}(X_2 - \mu_2)^2 + \cdots + \frac{1}{\sigma_n^2}(X_n - \mu_n)^2$$
is the quadratic form $(X - \mu)'V^{-1}(X - \mu)$.

Therefore
$$f(X_1, \ldots, X_n) = (2\pi)^{-\frac{n}{2}} |V|^{-\frac{1}{2}} e^{-\frac{1}{2}(X - \mu)'V^{-1}(X - \mu)}$$


ii. Formal definition: The random vector $X$, with $E(X) = \mu$ and $V(X) = V$, is said to be normally distributed iff
$$f(X) = (2\pi)^{-\frac{n}{2}} |V|^{-\frac{1}{2}} e^{-\frac{1}{2}(X - \mu)'V^{-1}(X - \mu)}$$
We then write $X \sim N(\mu, V)$.

When $X \sim N(0, I_n)$ we say $X$ is a standard normal vector.

iii. Properties of the normal distribution

A. If $X \sim N(\mu, V)$ and
$$\underset{m \times 1}{Y} = \underset{m \times n}{A}\,\underset{n \times 1}{X} + \underset{m \times 1}{b}$$
then $Y \sim N(A\mu + b, \, AVA')$:
$$E(Y) = A\mu + b, \qquad V(Y) = AVA' \text{ is p.d. since } \rho(A) = m$$

B. The orthogonal transformation of a standard normal vector is also a standard normal vector.
$$\underset{n \times 1}{Z} = \underset{n \times n}{C}\,\underset{n \times 1}{X},
\qquad C \text{ orthogonal: } C^{-1} = C', \;\; CC' = C'C = I$$
$$E(Z) = CE(X) = 0$$
$$V(Z) = CVC' = CIC' = CC' = I$$

Corollary: If $X \sim N(\mu, V)$ then by a suitable transformation we can get a standard normal vector. Since $V$ is p.d, there exists $Q$ such that $QVQ' = I$. Let
$$Y = Q(X - \mu) \implies E(Y) = 0, \quad V(Y) = QVQ' = I$$


$$Y \sim N(0, I)$$

C. For normal variables, zero covariance $\implies$ independence.

$X \sim N(\mu, V)$, partitioned as
$$X = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}
\begin{matrix} S \times 1 \\ (n - S) \times 1 \end{matrix},
\qquad
E(X) = \begin{bmatrix} E(X_1) \\ E(X_2) \end{bmatrix} = \begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix},
\qquad
V(X) = \begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}$$

Zero covariance between $X_1$ and $X_2$ means $V_{12} = V_{21}' = 0$. Then
$$\begin{aligned}
f(X) &= (2\pi)^{-\frac{n}{2}} (|V_{11}||V_{22}|)^{-\frac{1}{2}}
\exp\Big\{-\tfrac{1}{2}\,[(X_1 - \mu_1)' \;\; (X_2 - \mu_2)']
\begin{bmatrix} V_{11}^{-1} & 0 \\ 0 & V_{22}^{-1} \end{bmatrix}
\begin{bmatrix} X_1 - \mu_1 \\ X_2 - \mu_2 \end{bmatrix}\Big\} \\
&= (2\pi)^{-\frac{n}{2}} |V_{11}|^{-\frac{1}{2}} |V_{22}|^{-\frac{1}{2}}
e^{-\frac{1}{2}[(X_1 - \mu_1)'V_{11}^{-1}(X_1 - \mu_1) + (X_2 - \mu_2)'V_{22}^{-1}(X_2 - \mu_2)]} \\
&= (2\pi)^{-\frac{S}{2}} |V_{11}|^{-\frac{1}{2}} e^{-\frac{1}{2}(X_1 - \mu_1)'V_{11}^{-1}(X_1 - \mu_1)}
\cdot (2\pi)^{-\frac{n-S}{2}} |V_{22}|^{-\frac{1}{2}} e^{-\frac{1}{2}(X_2 - \mu_2)'V_{22}^{-1}(X_2 - \mu_2)}
\end{aligned}$$
so $X_1$ and $X_2$ are independent.

iv. The chi-squared

$X \sim N(0, I_n)$, so the $X_i$ are independent, and
$$X_1^2 + X_2^2 + \cdots + X_n^2 \sim \chi^2(n)$$


Characterization

A. $X_1^2 + X_2^2 + \cdots + X_n^2 = X'X$. Therefore, for $X \sim N(0, I_n)$,
$$X'X \sim \chi^2(n)$$

B. If $Y \sim N(\mu, V)$ then
$$(Y - \mu)'V^{-1}(Y - \mu) \sim \chi^2(n)$$
Since $V$ is p.d, there exists $Q$ such that $QVQ' = I$. Let $X = Q(Y - \mu)$; $X$ is normal with
$$E(X) = 0, \qquad V(X) = QVQ' = I$$
Therefore $X \sim N(0, I)$ and $X'X \sim \chi^2(n)$, i.e.
$$(Y - \mu)'Q'Q(Y - \mu) \sim \chi^2(n)$$
Now, from $QVQ' = I$:
$$Q^{-1}QVQ' = Q^{-1}
\implies VQ' = Q^{-1}
\implies VQ'(Q')^{-1} = Q^{-1}(Q')^{-1}
\implies V = (Q'Q)^{-1}$$
$$\implies V^{-1} = Q'Q$$
Therefore $(Y - \mu)'V^{-1}(Y - \mu) \sim \chi^2(n)$.

C. $Z \sim N(0, I_n)$, partitioned
$$Z = \begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix}
\begin{matrix} S \times 1 \\ (n - S) \times 1 \end{matrix}$$
Then $Z_1 \sim N(0, I_S)$ and $Z_2 \sim N(0, I_{n-S})$, independent, and
$$Z_1'Z_1 \sim \chi^2(S), \qquad Z_2'Z_2 \sim \chi^2(n - S), \quad \text{independent}$$


Now observe
$$\underset{S \times 1}{Z_1} = \underset{S \times n}{[I_S \;\; 0]}\,\underset{n \times 1}{Z} = AZ$$
$$Z_1'Z_1 = Z'A'AZ
= Z'\begin{bmatrix} I_S \\ 0 \end{bmatrix}[I_S \;\; 0]\,Z
= Z'\begin{bmatrix} I_S & 0 \\ 0 & 0 \end{bmatrix}Z$$
Therefore $Z_1'Z_1 = Z'MZ$ with $M$ idempotent and $\rho(M) = S$:
$$Z'MZ \sim \chi^2(S)$$
Similarly
$$Z_2 = [0 \;\; I_{n-S}]\begin{bmatrix} Z_1 \\ Z_2 \end{bmatrix} = [0 \;\; I_{n-S}]\,Z$$
$$Z_2'Z_2 = Z'\begin{bmatrix} 0 & 0 \\ 0 & I_{n-S} \end{bmatrix}Z = Z'\bar{M}Z,
\qquad \bar{M} \text{ idempotent}, \;\; \rho(\bar{M}) = n - S$$
Therefore $Z'\bar{M}Z \sim \chi^2(n - S)$.

Also $\bar{M} = I - M$, and
$$M\bar{M} = M(I - M) = M - M \cdot M = M - M = 0$$

Theorem: If $Z \sim N(0, I_n)$ and $M$ is an idempotent matrix of rank $S$, then
$$Z'MZ \sim \chi^2(S), \qquad Z'(I - M)Z \sim \chi^2(n - S)$$
and the two $\chi^2$ variables are independent since $M(I - M) = 0$.
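The algebraic facts underlying the theorem, that $M$ is idempotent of rank $S$, that $M(I - M) = 0$, and that $Z'MZ + Z'(I - M)Z = Z'Z$, can all be checked directly; a numpy sketch (the sizes $n = 5$, $S = 2$ and the test vector are my own):

```python
import numpy as np

n, S = 5, 2
M = np.zeros((n, n))
M[:S, :S] = np.eye(S)             # M = [[I_S, 0], [0, 0]]
M_bar = np.eye(n) - M             # M_bar = I - M

z = np.arange(1., n + 1.)         # any vector z
q1 = z @ M @ z                    # = Z1'Z1, the first S squared components
q2 = z @ M_bar @ z                # = Z2'Z2, the remaining n - S squared components
```

The two quadratic forms split $z'z$ exactly, mirroring the split of $\chi^2(n)$ into independent $\chi^2(S)$ and $\chi^2(n - S)$ pieces.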
