Singular Value Decomposition (matrix factorization)
The SVD is a factorization of an $m \times n$ matrix into
$$A = U\,\Sigma\,V^T$$
where $U$ is an $m \times m$ orthogonal matrix, $V^T$ is an $n \times n$ orthogonal matrix, and $\Sigma$ is an $m \times n$ diagonal matrix.
For a square matrix ($m = n$):
$$
A = \begin{pmatrix} \vdots & & \vdots \\ u_1 & \cdots & u_n \\ \vdots & & \vdots \end{pmatrix}
\begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix}
\begin{pmatrix} \cdots & v_1^T & \cdots \\ & \vdots & \\ \cdots & v_n^T & \cdots \end{pmatrix}
$$
with the singular values ordered so that $\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \cdots$
What happens when $A$ is not a square matrix?

1) $m > n$
$$
A = U\,\Sigma\,V^T =
\begin{pmatrix} u_1 & \cdots & u_n & \cdots & u_m \end{pmatrix}
\begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \cdots & 0 \\ \vdots & & \vdots \end{pmatrix}
\begin{pmatrix} v_1^T \\ \vdots \\ v_n^T \end{pmatrix},
\qquad (m \times m)\,(m \times n)\,(n \times n)
$$
Since the last $m - n$ rows of $\Sigma$ are zero, we can instead re-write the above as the reduced SVD:
$$A = U_R\,\Sigma_R\,V^T$$
where $U_R$ is an $m \times n$ matrix and $\Sigma_R$ is an $n \times n$ matrix.
2) $n > m$
$$
A = U\,\Sigma\,V^T =
\begin{pmatrix} u_1 & \cdots & u_m \end{pmatrix}
\begin{pmatrix} \sigma_1 & & & 0 & \cdots & 0 \\ & \ddots & & \vdots & & \vdots \\ & & \sigma_m & 0 & \cdots & 0 \end{pmatrix}
\begin{pmatrix} v_1^T \\ \vdots \\ v_n^T \end{pmatrix},
\qquad (m \times m)\,(m \times n)\,(n \times n)
$$
Since the last $n - m$ columns of $\Sigma$ are zero, we can instead re-write the above as the reduced SVD:
$$A = U\,\Sigma_R\,V_R^T$$
where $V_R$ is an $n \times m$ matrix and $\Sigma_R$ is an $m \times m$ matrix.
In general:
$$A = U_R\,\Sigma_R\,V_R^T$$
where $U_R$ is an $m \times k$ matrix, $\Sigma_R$ is a $k \times k$ matrix, $V_R$ is an $n \times k$ matrix, and $k = \min(m, n)$.
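As an illustrative sketch (not part of the original notes), the full and reduced factorizations can be compared with NumPy's `numpy.linalg.svd`, which switches between them via the `full_matrices` flag; the 5×3 random matrix is a made-up example of the $m > n$ case:

```python
import numpy as np

# Hypothetical 5x3 example (m > n) to compare full and reduced SVD shapes.
rng = np.random.default_rng(42)
A = rng.standard_normal((5, 3))

# Full SVD: U is m x m, V^T is n x n, s holds the min(m, n) singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
print(U.shape, s.shape, Vt.shape)     # (5, 5) (3,) (3, 3)

# Reduced SVD: U_R is m x k with k = min(m, n).
Ur, sr, Vtr = np.linalg.svd(A, full_matrices=False)
print(Ur.shape, sr.shape, Vtr.shape)  # (5, 3) (3,) (3, 3)

# Both reconstruct A; the reduced form simply drops the zero block of Sigma.
Sigma = np.zeros((5, 3))
np.fill_diagonal(Sigma, s)
assert np.allclose(U @ Sigma @ Vt, A)
assert np.allclose(Ur @ np.diag(sr) @ Vtr, A)
```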
Let's take a look at the product $\Sigma^T\Sigma$, where $\Sigma$ has the singular values of $A$, an $m \times n$ matrix.

For $m > n$:
$$
\Sigma^T\Sigma =
\begin{pmatrix} \sigma_1 & & & 0 & \cdots \\ & \ddots & & \vdots & \\ & & \sigma_n & 0 & \cdots \end{pmatrix}
\begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \cdots & 0 \\ \vdots & & \vdots \end{pmatrix}
= \begin{pmatrix} \sigma_1^2 & & \\ & \ddots & \\ & & \sigma_n^2 \end{pmatrix},
\qquad (n \times m)(m \times n) = (n \times n)
$$

For $n > m$:
$$
\Sigma^T\Sigma =
\begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_m \\ 0 & \cdots & 0 \\ \vdots & & \vdots \end{pmatrix}
\begin{pmatrix} \sigma_1 & & & 0 & \cdots \\ & \ddots & & \vdots & \\ & & \sigma_m & 0 & \cdots \end{pmatrix}
= \begin{pmatrix} \sigma_1^2 & & & & \\ & \ddots & & & \\ & & \sigma_m^2 & & \\ & & & 0 & \\ & & & & \ddots \end{pmatrix},
\qquad (n \times m)(m \times n) = (n \times n)
$$

In both cases $\Sigma^T\Sigma$ is an $n \times n$ diagonal matrix.
Assume $A$ has the singular value decomposition $A = U\,\Sigma\,V^T$. Let's take a look at the eigenpairs corresponding to $A^TA$:
$$
A^TA = (U\Sigma V^T)^T(U\Sigma V^T) = V\Sigma^TU^TU\Sigma V^T = V\,(\Sigma^T\Sigma)\,V^T
$$
Hence $A^TA = V\,\Sigma^2\,V^T$.
Recall that the columns of $V$ are all linearly independent ($V$ is an orthogonal matrix); then from diagonalization ($B = XDX^{-1}$), we get:
• the columns of $V$ are the eigenvectors of the matrix $A^TA$
• the diagonal entries of $\Sigma^2$ are the eigenvalues of $A^TA$
Let's call $\lambda_i$ the eigenvalues of $A^TA$; then $\sigma_i^2 = \lambda_i$.
In a similar way,
$$
AA^T = (U\Sigma V^T)(U\Sigma V^T)^T = U\Sigma V^TV\Sigma^TU^T = U\,(\Sigma\Sigma^T)\,U^T
$$
Hence $AA^T = U\,\Sigma^2\,U^T$.
Recall that the columns of $U$ are all linearly independent ($U$ is an orthogonal matrix); then from diagonalization ($B = XDX^{-1}$), we get:
• The columns of $U$ are the eigenvectors of the matrix $AA^T$
How can we compute an SVD of a matrix $A$?
1. Evaluate the $n$ eigenvectors $v_i$ and eigenvalues $\lambda_i$ of $A^TA$.
2. Make a matrix $V$ from the normalized vectors $v_i$. The columns are called "right singular vectors".
$$V = \begin{pmatrix} \vdots & & \vdots \\ v_1 & \cdots & v_n \\ \vdots & & \vdots \end{pmatrix}$$
3. Make a diagonal matrix from the square roots of the eigenvalues.
$$\Sigma = \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \end{pmatrix}, \qquad \sigma_i = \sqrt{\lambda_i} \ \text{ and } \ \sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \cdots$$
4. Find $U$: $A = U\Sigma V^T \implies U\Sigma = AV \implies U = AV\Sigma^{-1}$. The columns are called the "left singular vectors".
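The four steps above can be sketched in NumPy as follows. This is a minimal illustration, assuming $A$ has full column rank so that $\Sigma^{-1}$ exists in step 4; the random 5×3 matrix is a made-up example:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))  # m > n, full column rank (almost surely)

# Step 1: eigenpairs of the symmetric matrix A^T A (use eigh).
lam, V = np.linalg.eigh(A.T @ A)

# Step 2-3: sort eigenpairs in descending order; sigma_i = sqrt(lambda_i).
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]
sigma = np.sqrt(lam)

# Step 4: U = A V Sigma^{-1} (valid when all sigma_i > 0).
U = A @ V / sigma

# Check the factorization: A = U Sigma V^T (reduced form),
# and compare against NumPy's singular values.
assert np.allclose(U @ np.diag(sigma) @ V.T, A)
assert np.allclose(sigma, np.linalg.svd(A, compute_uv=False))
```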
True or False?
$A$ has the singular value decomposition $A = U\,\Sigma\,V^T$.
• The matrices $U$ and $V$ are not singular
• The matrix $\Sigma$ can have zero diagonal entries
• $\|U\|_2 = 1$
• The SVD exists even when the matrix $A$ is singular
• The algorithm to evaluate the SVD will fail when taking the square root of a negative eigenvalue
Singular values cannot be negative, since $A^TA$ is a positive semi-definite matrix (for real matrices $A$):
• A matrix $B$ is positive definite if $x^TBx > 0$ for all $x \ne 0$
• A matrix $B$ is positive semi-definite if $x^TBx \ge 0$ for all $x \ne 0$
• What do we know about the matrix $A^TA$?
$$x^T(A^TA)x = (Ax)^T(Ax) = \|Ax\|_2^2 \ge 0$$
• Hence we know that $A^TA$ is a positive semi-definite matrix
• A positive semi-definite matrix has non-negative eigenvalues:
$$Bx = \lambda x \implies x^TBx = x^T\lambda x = \lambda\|x\|_2^2 \ge 0 \implies \lambda \ge 0$$
Singular values are always non-negative.
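A quick numerical check of this fact (an illustrative sketch, not part of the original notes; the random matrix is a made-up example):

```python
import numpy as np

# Check that A^T A is positive semi-definite: its eigenvalues are all >= 0
# (up to roundoff), even when A is rank deficient.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 4))
A[:, 3] = A[:, 0]                  # make A rank deficient on purpose

lam = np.linalg.eigvalsh(A.T @ A)  # eigenvalues of the symmetric matrix A^T A
print(lam)                         # all >= 0; the smallest is ~0 (roundoff)
assert np.all(lam >= -1e-10)
```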
Cost of SVD
The cost of an SVD is proportional to $mn^2 + n^3$, where the constant of proportionality ranges from 4 to 10 (or more) depending on the algorithm:
$$C_{SVD} = \alpha\,(mn^2 + n^3) = O(n^3), \qquad C_{matmat} = n^3 = O(n^3), \qquad C_{LU} = 2n^3/3 = O(n^3)$$
SVD summary:
• The SVD is a factorization of an $m \times n$ matrix into $A = U\,\Sigma\,V^T$, where $U$ is an $m \times m$ orthogonal matrix, $V^T$ is an $n \times n$ orthogonal matrix, and $\Sigma$ is an $m \times n$ diagonal matrix.
• In reduced form: $A = U_R\,\Sigma_R\,V_R^T$, where $U_R$ is an $m \times k$ matrix, $\Sigma_R$ is a $k \times k$ matrix, $V_R$ is an $n \times k$ matrix, and $k = \min(m, n)$.
• The columns of $V$ are the eigenvectors of the matrix $A^TA$, denoted the right singular vectors.
• The columns of $U$ are the eigenvectors of the matrix $AA^T$, denoted the left singular vectors.
• The diagonal entries of $\Sigma^2$ are the eigenvalues of $A^TA$; $\sigma_i = \sqrt{\lambda_i}$ are called the singular values.
• The singular values are always non-negative (since $A^TA$ is a positive semi-definite matrix, its eigenvalues are always $\lambda \ge 0$).
Singular Value Decomposition (applications)
1) Determining the rank of a matrix

Suppose $A$ is an $m \times n$ rectangular matrix where $m > n$:
$$
A = \begin{pmatrix} u_1 & \cdots & u_n & \cdots & u_m \end{pmatrix}
\begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \cdots & 0 \\ \vdots & & \vdots \end{pmatrix}
\begin{pmatrix} v_1^T \\ \vdots \\ v_n^T \end{pmatrix}
= \sum_{i=1}^{n} \sigma_i u_i v_i^T
$$

Expanding the product (columns of $U$ times rows of $\Sigma V^T$):
$$
A = \begin{pmatrix} u_1 & \cdots & u_n \end{pmatrix}
\begin{pmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_n v_n^T \end{pmatrix}
= \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \cdots + \sigma_n u_n v_n^T
$$

Consider $A_1 = \sigma_1 u_1 v_1^T$. What is $\operatorname{rank}(A_1)$?
A) 1  B) n  C) depends on the matrix  D) None of the above

In general, $\operatorname{rank}(A_1) = 1$, since $A_1$ is an outer product of two vectors.
Rank of a matrix
For a general rectangular matrix $A$ with dimensions $m \times n$, the reduced SVD is:
$$
A = U_R\,\Sigma_R\,V_R^T = \sum_{i=1}^{k} \sigma_i u_i v_i^T,
\qquad (m \times n) = (m \times k)(k \times k)(k \times n), \quad k = \min(m, n)
$$
where the full $\Sigma$ has one of the two shapes
$$
\Sigma = \begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \cdots & 0 \\ \vdots & & \vdots \end{pmatrix} \ (m > n)
\qquad \text{or} \qquad
\Sigma = \begin{pmatrix} \sigma_1 & & & 0 & \cdots \\ & \ddots & & \vdots & \\ & & \sigma_m & 0 & \cdots \end{pmatrix} \ (n > m)
$$

If $\sigma_i \ne 0$ for all $i$, then $\operatorname{rank}(A) = k$ (full rank matrix).
In general, $\operatorname{rank}(A) = r$, where $r$ is the number of non-zero singular values $\sigma_i$; if $r < k$, the matrix is rank deficient.
• The rank of $A$ equals the number of non-zero singular values, which is the same as the number of non-zero diagonal elements in $\Sigma$.
• Rounding errors may lead to small but non-zero singular values in a rank-deficient matrix; hence the rank of a matrix determined by the number of non-zero singular values is sometimes called the "effective rank".
• The right singular vectors (columns of $V$) corresponding to vanishing singular values span the null space of $A$.
• The left singular vectors (columns of $U$) corresponding to the non-zero singular values of $A$ span the range of $A$.
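The effective-rank idea can be sketched numerically: count singular values above a tolerance rather than testing for exact zero. The matrix below is a made-up example, and the tolerance mirrors the default used by `numpy.linalg.matrix_rank`:

```python
import numpy as np

# Effective rank: count singular values above a tolerance, since roundoff
# can leave tiny non-zero singular values in an exactly rank-deficient matrix.
rng = np.random.default_rng(2)
A = rng.standard_normal((6, 4))
A[:, 3] = 2 * A[:, 1] - A[:, 2]   # force rank 3 by making a column dependent

s = np.linalg.svd(A, compute_uv=False)
tol = s[0] * max(A.shape) * np.finfo(float).eps  # matrix_rank's default tolerance
rank = int(np.sum(s > tol))
print(rank)                       # 3
assert rank == np.linalg.matrix_rank(A) == 3
```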
2) Pseudo-inverse
• Problem: if $A$ is rank-deficient, $\Sigma$ is not invertible.
• How to fix it: define the pseudo-inverse.
• Pseudo-inverse of a diagonal matrix:
$$(\Sigma^+)_{ii} = \begin{cases} 1/\sigma_i, & \text{if } \sigma_i \ne 0 \\ 0, & \text{if } \sigma_i = 0 \end{cases}$$
• Pseudo-inverse of a matrix $A$:
$$A^+ = V\,\Sigma^+\,U^T$$
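A sketch of this construction, building $A^+ = V\Sigma^+U^T$ by hand for a rank-deficient matrix and checking it against `numpy.linalg.pinv`. The matrix and the `1e-10` cutoff for "non-zero" singular values are illustrative choices, not part of the notes:

```python
import numpy as np

# Build A^+ = V Sigma^+ U^T from the SVD of a rank-deficient matrix.
rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))
A[:, 2] = A[:, 0] + A[:, 1]       # rank 2, so Sigma is not invertible

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Invert only the (numerically) non-zero singular values.
s_plus = np.array([1.0 / si if si > 1e-10 else 0.0 for si in s])
A_plus = Vt.T @ np.diag(s_plus) @ U.T

assert np.allclose(A_plus, np.linalg.pinv(A))
```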
3) Matrix norms
The Euclidean norm of an orthogonal matrix is equal to 1:
$$
\|U\|_2 = \max_{\|x\|_2=1} \|Ux\|_2 = \max_{\|x\|_2=1} \sqrt{(Ux)^T(Ux)} = \max_{\|x\|_2=1} \sqrt{x^Tx} = \max_{\|x\|_2=1} \|x\|_2 = 1
$$

The Euclidean norm of a matrix is given by the largest singular value:
$$
\|A\|_2 = \max_{\|x\|_2=1} \|Ax\|_2 = \max_{\|x\|_2=1} \|U\Sigma V^Tx\|_2 = \max_{\|x\|_2=1} \|\Sigma V^Tx\|_2
= \max_{\|V^Tx\|_2=1} \|\Sigma V^Tx\|_2 = \max_{\|y\|_2=1} \|\Sigma y\|_2 = \max(\sigma_i)
$$
where we used the fact that $\|U\|_2 = 1$, $\|V\|_2 = 1$ and $\Sigma$ is diagonal. Hence
$$\|A\|_2 = \max(\sigma_i) = \sigma_{max}$$
where $\sigma_{max}$ is the largest singular value.
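Both facts are easy to check numerically (an illustrative sketch; the random matrices are made-up examples):

```python
import numpy as np

# Check that the induced 2-norm of a matrix equals its largest singular
# value, and that an orthogonal matrix has 2-norm equal to 1.
rng = np.random.default_rng(4)
A = rng.standard_normal((5, 3))

s = np.linalg.svd(A, compute_uv=False)
assert np.isclose(np.linalg.norm(A, 2), s[0])     # ||A||_2 = sigma_max

Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # a random orthogonal matrix
assert np.isclose(np.linalg.norm(Q, 2), 1.0)      # ||Q||_2 = 1
```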
4) Norm of the inverse of a matrix
Assume $A$ is full rank, so that $A^{-1}$ exists. The Euclidean norm of the inverse of a square matrix is given by:
$$
\|A^{-1}\|_2 = \max_{\|x\|_2=1} \|(U\Sigma V^T)^{-1}x\|_2 = \max_{\|x\|_2=1} \|V\Sigma^{-1}U^Tx\|_2
$$
Since $\|U\|_2 = 1$, $\|V\|_2 = 1$ and $\Sigma$ is diagonal, then
$$\|A^{-1}\|_2 = \frac{1}{\sigma_{min}}$$
where $\sigma_{min}$ is the smallest singular value.
5) Norm of the pseudo-inverse matrix
The norm of the pseudo-inverse of an $m \times n$ matrix is:
$$\|A^+\|_2 = \frac{1}{\sigma_r}$$
where $\sigma_r$ is the smallest non-zero singular value. This is valid for any matrix, regardless of the shape or rank.
Note that for a full-rank square matrix, $\|A^+\|_2$ is the same as $\|A^{-1}\|_2$.
Zero matrix: if $A$ is a zero matrix, then $A^+$ is also the zero matrix, and $\|A^+\|_2 = 0$.
6) Condition number of a matrix
The condition number of a matrix is given by
$$cond_2(A) = \|A\|_2\,\|A^+\|_2$$
If the matrix is full rank, $\operatorname{rank}(A) = \min(m, n)$:
$$cond_2(A) = \frac{\sigma_{max}}{\sigma_{min}}$$
where $\sigma_{max}$ is the largest singular value and $\sigma_{min}$ is the smallest singular value.
If the matrix is rank deficient, $\operatorname{rank}(A) < \min(m, n)$:
$$cond_2(A) = \infty$$
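A quick sketch of the full-rank case, comparing $\sigma_{max}/\sigma_{min}$ against `numpy.linalg.cond` (the random 4×4 matrix is a made-up example; in floating point a rank-deficient matrix typically shows an enormous finite condition number rather than a literal infinity):

```python
import numpy as np

# cond_2(A) = sigma_max / sigma_min for a full-rank matrix.
rng = np.random.default_rng(5)
A = rng.standard_normal((4, 4))

s = np.linalg.svd(A, compute_uv=False)
assert np.isclose(np.linalg.cond(A, 2), s[0] / s[-1])
```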
7) Low-Rank Approximation
Another way to write the SVD (assuming for now $m > n$ for simplicity):
$$
A = \begin{pmatrix} u_1 & \cdots & u_m \end{pmatrix}
\begin{pmatrix} \sigma_1 & & \\ & \ddots & \\ & & \sigma_n \\ 0 & \cdots & 0 \\ \vdots & & \vdots \end{pmatrix}
\begin{pmatrix} v_1^T \\ \vdots \\ v_n^T \end{pmatrix}
= \begin{pmatrix} u_1 & \cdots & u_n \end{pmatrix}
\begin{pmatrix} \sigma_1 v_1^T \\ \vdots \\ \sigma_n v_n^T \end{pmatrix}
= \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \cdots + \sigma_n u_n v_n^T
$$
The SVD writes the matrix $A$ as a sum of outer products (of left and right singular vectors), with
$$\sigma_1 \ge \sigma_2 \ge \sigma_3 \ge \cdots \ge 0$$
The best rank-$k$ approximation for an $m \times n$ matrix $A$ (where $k \le \min(m, n)$) is the one that minimizes
$$\min_{\operatorname{rank}(A_k) = k} \|A - A_k\|$$
When using the induced 2-norm, the best rank-$k$ approximation is given by truncating the sum:
$$A_k = \sigma_1 u_1 v_1^T + \sigma_2 u_2 v_2^T + \cdots + \sigma_k u_k v_k^T$$
Note that $\operatorname{rank}(A) = n$ and $\operatorname{rank}(A_k) = k$, and the norm of the difference between the matrix and its approximation is
$$
\|A - A_k\|_2 = \|\sigma_{k+1} u_{k+1} v_{k+1}^T + \sigma_{k+2} u_{k+2} v_{k+2}^T + \cdots + \sigma_n u_n v_n^T\|_2 = \sigma_{k+1}
$$
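A sketch of the truncation and its error formula (the 8×6 random matrix and the choice $k = 2$ are made-up for illustration):

```python
import numpy as np

# Best rank-k approximation A_k = sum_{i<=k} sigma_i u_i v_i^T,
# with 2-norm error ||A - A_k||_2 = sigma_{k+1}.
rng = np.random.default_rng(6)
A = rng.standard_normal((8, 6))
U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # keep the k largest terms

assert np.linalg.matrix_rank(A_k) == k
assert np.isclose(np.linalg.norm(A - A_k, 2), s[k])  # error = sigma_{k+1}
```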
Example: Image compression
[Figure: an image compared with its rank-$k$ approximations for several values of $k$; shown is the image using a rank-50 approximation.]
8) Using the SVD to solve a square system of linear equations
If $A$ is an $n \times n$ square matrix and we want to solve $Ax = b$, we can use the SVD of $A$:
$$U\Sigma V^Tx = b \implies \Sigma V^Tx = U^Tb$$
Solve: $\Sigma y = U^Tb$ (diagonal matrix, easy to solve!)
Evaluate: $x = Vy$
Cost of solve: $O(n^2)$ (cost of decomposition: $O(n^3)$). Recall that the SVD and LU have the same asymptotic cost; however, the number of operations (the constant factor in front of $n^3$) is larger for the SVD than for LU.
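The two-step solve above can be sketched as follows (the random 4×4 system is a made-up example, assumed nonsingular):

```python
import numpy as np

# Solve A x = b via the SVD: Sigma (V^T x) = U^T b, then x = V y.
rng = np.random.default_rng(7)
A = rng.standard_normal((4, 4))
b = rng.standard_normal(4)

U, s, Vt = np.linalg.svd(A)

y = (U.T @ b) / s          # diagonal solve: Sigma y = U^T b
x = Vt.T @ y               # evaluate: x = V y

assert np.allclose(A @ x, b)
assert np.allclose(x, np.linalg.solve(A, b))
```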