2. Mathematical Preliminaries
Pattern Recognition: Introduction
Mathematical Preliminaries
• Random Variables
• Linear Transformations
• Eigenvalues and Eigenvectors
• Orthonormal Transformations
• Matrix Inversion
Random Variables
In statistical pattern recognition, a pattern (the input to a PR system) is a d-dimensional feature vector, i.e. a random vector x = (x₁, …, x_d)ᵀ.
A random vector x is fully characterized by its (cumulative) distribution function:
F(x) = P(X₁ ≤ x₁, …, X_d ≤ x_d)
or by its density function:
p(x) = ∂ᵈF(x) / (∂x₁ ⋯ ∂x_d)
• In pattern recognition, we deal with random vectors drawn from different classes
• Conditional density of class ωᵢ (L classes): p(x | ωᵢ), i = 1, …, L
• Unconditional density function of x (mixture density function):
p(x) = Σᵢ₌₁ᴸ P(ωᵢ) p(x | ωᵢ)
where P(ωᵢ) is the a priori probability of class ωᵢ
• A posteriori probability of ωᵢ given x (Bayes theorem):
P(ωᵢ | x) = p(x | ωᵢ) P(ωᵢ) / p(x) = p(x | ωᵢ) P(ωᵢ) / Σⱼ₌₁ᴸ p(x | ωⱼ) P(ωⱼ)
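As a concrete check, the posterior can be computed directly from priors and class-conditional densities. The two-class 1-d Gaussian setup below is a hypothetical example, not from the slides:

```python
import numpy as np

# Hypothetical two-class, 1-d example with Gaussian class-conditional
# densities p(x|w_i); the priors P(w_i) are assumed values.
def gaussian_pdf(x, mean, var):
    return np.exp(-(x - mean) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

priors = np.array([0.6, 0.4])            # P(w_1), P(w_2)
means = np.array([0.0, 2.0])
variances = np.array([1.0, 1.0])

x = 1.0
likelihoods = gaussian_pdf(x, means, variances)   # p(x | w_i)
mixture = np.sum(priors * likelihoods)            # p(x), the mixture density
posteriors = priors * likelihoods / mixture       # P(w_i | x), Bayes theorem
```

Since x = 1 is equidistant from both class means and the variances are equal, the likelihoods cancel and the posterior equals the prior in this particular case.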
• Expected vector (mean): m = E{x} = ∫ x p(x) dx
• Covariance matrix (indicates the dispersion of the distribution):
Σ = E{(x − m)(x − m)ᵀ}, with entries σᵢⱼ = E{(xᵢ − mᵢ)(xⱼ − mⱼ)}
σᵢᵢ = σᵢ² is the variance of xᵢ, σᵢ is the standard deviation of xᵢ, and
ρᵢⱼ = σᵢⱼ / (σᵢ σⱼ) is the correlation coefficient between xᵢ and xⱼ
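These quantities are easy to estimate from samples. A minimal numpy sketch, with a hypothetical 2-d distribution whose mean and covariance are chosen arbitrarily:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical correlated 2-d data with known mean and covariance.
true_mean = np.array([1.0, -1.0])
true_cov = np.array([[2.0, 0.8],
                     [0.8, 1.0]])
X = rng.multivariate_normal(true_mean, true_cov, size=200_000)

m = X.mean(axis=0)                          # sample estimate of E{x}
Sigma = np.cov(X.T)                         # sample covariance matrix
sigma = np.sqrt(np.diag(Sigma))             # standard deviations sigma_i
rho = Sigma[0, 1] / (sigma[0] * sigma[1])   # correlation coefficient rho_12
```

With 200,000 samples the estimates match the true parameters to about two decimal places.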
S = E{x xᵀ} is the autocorrelation matrix of x; it relates to the covariance matrix by Σ = S − m mᵀ
Gaussian (normal) distribution: describes data that cluster around a mean or average.
1-d:  p(x) = (1 / (√(2π) σ)) exp(−(x − m)² / (2σ²))
d ≥ 1:  p(x) = (2π)^(−d/2) |Σ|^(−1/2) exp(−½ d²(x))
where d²(x) = (x − m)ᵀ Σ⁻¹ (x − m) is the (squared Mahalanobis) distance function.
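The d ≥ 1 formula can be implemented directly; a minimal sketch (the function name `mvn_pdf` is mine, not from the slides):

```python
import numpy as np

def mvn_pdf(x, mean, cov):
    """Gaussian density for d >= 1, via the squared Mahalanobis distance."""
    d = len(mean)
    diff = x - mean
    d2 = diff @ np.linalg.solve(cov, diff)             # (x-m)^T Sigma^-1 (x-m)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * d2) / norm

# For d = 1 with m = 0 and sigma = 1 this reduces to the 1-d formula,
# giving 1/sqrt(2*pi) at x = 0.
p = mvn_pdf(np.array([0.0]), np.array([0.0]), np.array([[1.0]]))
```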
Normal distribution:
• is uniquely characterized by the expected vector and covariance matrix
• The assumption of normality is a reasonable approximation for many real data sets
• However, normality should not be assumed without good justification
• More about normal distribution
Linear Transformations
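The equations on these slides did not survive extraction. The standard result they cover is that a linear transformation y = Aᵀx maps the mean and covariance as m_y = Aᵀm and Σ_y = AᵀΣA. A minimal numpy check, with an arbitrarily chosen matrix A and hypothetical parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
m = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
X = rng.multivariate_normal(m, Sigma, size=200_000)

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])          # arbitrary transformation matrix
Y = X @ A                            # row-wise y = A^T x

m_y = Y.mean(axis=0)                 # should match A^T m
Sigma_y = np.cov(Y.T)                # should match A^T Sigma A
```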
Eigenvalues and Eigenvectors
The eigenvalue problem for the covariance matrix:
Σ φᵢ = λᵢ φᵢ,  i = 1, …, d
φᵢ – eigenvectors
λᵢ – eigenvalues
Write for all i:  Σ Φ = Φ Λ
Λ = diag(λ₁, …, λ_d) is the eigenvalue matrix; Φ = [φ₁ ⋯ φ_d] is the eigenvector matrix.
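In numpy, `np.linalg.eigh` computes both matrices for a symmetric matrix such as a covariance; a minimal sketch with a hypothetical Σ:

```python
import numpy as np

# Hypothetical 2-d covariance matrix (symmetric, positive definite).
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])

# eigh is for symmetric matrices: it returns the eigenvalues in ascending
# order and the orthonormal eigenvectors as the columns of Phi.
lam, Phi = np.linalg.eigh(Sigma)
Lambda = np.diag(lam)        # the eigenvalue matrix

# Sigma @ Phi == Phi @ Lambda, i.e. Sigma phi_i = lambda_i phi_i for all i.
```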
The eigenvectors corresponding to two different eigenvalues are orthogonal:
φᵢᵀ φⱼ = 0 for λᵢ ≠ λⱼ; with unit-length eigenvectors, Φᵀ Φ = Φ Φᵀ = I
Applying the orthonormal transformation y = Φᵀ x gives Σ_y = Φᵀ Σ Φ = Λ, which is diagonal: there is no correlation in the transformed space.
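A minimal numpy sketch of this decorrelation, using a hypothetical correlated 2-d sample:

```python
import numpy as np

rng = np.random.default_rng(2)
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
X = rng.multivariate_normal([0.0, 0.0], Sigma, size=100_000)

# Eigenvectors of the sample covariance define the orthonormal transform.
lam, Phi = np.linalg.eigh(np.cov(X.T))
Y = X @ Phi                          # row-wise y = Phi^T x

Sigma_y = np.cov(Y.T)
# Sigma_y is diagonal and equal to Lambda: no correlation between y1 and y2.
```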
Eigenvectors are the principal axes of the distribution.
• We rotate the coordinate system
• There is no correlation between y₁ and y₂
• λ₁ gives the variance of y₁, λ₂ gives the variance of y₂
Whitening
After applying the orthonormal transformation, we can add another transformation that will make the covariance matrix equal to I:
z = Λ^(−1/2) Φᵀ x,  so that  Σ_z = Λ^(−1/2) Φᵀ Σ Φ Λ^(−1/2) = Λ^(−1/2) Λ Λ^(−1/2) = I
Purpose: to change the scales of the principal components in proportion to λᵢ^(−1/2)
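A minimal numpy sketch of whitening, again with a hypothetical correlated sample:

```python
import numpy as np

rng = np.random.default_rng(3)
Sigma = np.array([[2.0, 1.0],
                  [1.0, 2.0]])
X = rng.multivariate_normal([0.0, 0.0], Sigma, size=100_000)

lam, Phi = np.linalg.eigh(np.cov(X.T))
W = Phi @ np.diag(lam ** -0.5)       # whitening matrix Phi Lambda^(-1/2)
Z = X @ W                             # row-wise z = Lambda^(-1/2) Phi^T x

# The covariance of the whitened sample is the identity matrix.
C = np.cov(Z.T)
```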
Simultaneous Diagonalization
Goal: diagonalize Σ₁ and Σ₂ simultaneously by a linear transformation. Whiten with respect to Σ₁ (z = Λ₁^(−1/2) Φ₁ᵀ x), then apply the orthonormal transformation Ψ that diagonalizes the whitened Σ₂; the combined transform A = Φ₁ Λ₁^(−1/2) Ψ satisfies AᵀΣ₁A = I and AᵀΣ₂A = Λ (diagonal).
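The two-step construction (whiten with respect to Σ₁, then rotate) can be sketched in numpy; the matrices S1 and S2 below are hypothetical:

```python
import numpy as np

S1 = np.array([[2.0, 0.5],
               [0.5, 1.0]])
S2 = np.array([[1.0, 0.3],
               [0.3, 2.0]])

# Step 1: whiten with respect to S1.
l1, P1 = np.linalg.eigh(S1)
W = P1 @ np.diag(l1 ** -0.5)

# Step 2: diagonalize the whitened S2 with an orthonormal transform.
l2, P2 = np.linalg.eigh(W.T @ S2 @ W)
A = W @ P2                           # the combined transformation

# A^T S1 A = I and A^T S2 A = diag(l2): both diagonalized at once.
```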
Orthonormal Transformations
Properties:
a) Eigenvalues are invariant: for an orthonormal A (AᵀA = AAᵀ = I), Σ_y = AᵀΣA has the same eigenvalues as Σ, since Σ_y ψ = λψ implies Σ(Aψ) = λ(Aψ)
b) Euclidean distance is invariant:
‖y₁ − y₂‖² = (x₁ − x₂)ᵀ A Aᵀ (x₁ − x₂) = ‖x₁ − x₂‖²
Distance in Y-space is the same as in X-space
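A quick numerical check, using a 2-d rotation as the orthonormal transformation (the points are arbitrary):

```python
import numpy as np

t = 0.7
A = np.array([[np.cos(t), -np.sin(t)],     # a rotation: A^T A = A A^T = I
              [np.sin(t),  np.cos(t)]])

x1 = np.array([1.0, 2.0])
x2 = np.array([-0.5, 0.3])
y1, y2 = A.T @ x1, A.T @ x2                # y = A^T x

# The Euclidean distance is unchanged by the orthonormal transformation.
d_x = np.linalg.norm(x1 - x2)
d_y = np.linalg.norm(y1 - y2)
```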
Trace of Covariance Matrix
Let's look at the trace of Σ, the sum of its diagonal terms:
tr Σ = Σᵢ₌₁ᵈ σᵢᵢ = Σᵢ₌₁ᵈ λᵢ
The trace of Σ is the summation of all eigenvalues and is invariant under any orthonormal transformation
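Both claims are easy to verify numerically; a 2×2 sketch with a rotation as the orthonormal transformation:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam = np.linalg.eigvalsh(S)              # eigenvalues: 1 and 3

trace_S = np.trace(S)                    # sum of diagonal terms = 4

t = 0.4
A = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])  # orthonormal (a rotation)
trace_rotated = np.trace(A.T @ S @ A)    # unchanged by the transformation
```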
Determinant of Covariance Matrix
The determinant of Σ is equal to the product of all eigenvalues, |Σ| = λ₁λ₂⋯λ_d, and is invariant under any orthonormal transformation
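The same 2×2 check works for the determinant:

```python
import numpy as np

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])
lam = np.linalg.eigvalsh(S)              # eigenvalues: 1 and 3

det_S = np.linalg.det(S)                 # |S| = product of eigenvalues = 3

t = 0.4
A = np.array([[np.cos(t), -np.sin(t)],
              [np.sin(t),  np.cos(t)]])  # orthonormal (a rotation)
det_rotated = np.linalg.det(A.T @ S @ A) # unchanged by the transformation
```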
Rank of Covariance Matrix
Rank of Σ is equal to the number of nonzero eigenvalues.
Relation between |S| and |Σ|, where S = E{x xᵀ} = Σ + m mᵀ is the autocorrelation matrix:
|S| = |Σ + m mᵀ| = |Σ| (1 + mᵀ Σ⁻¹ m)
Matrix Inversion
Introduction to matrix inversion: 1 and 2
Generalized Inverse (Pseudo Inverse)
However, if the matrix is singular, some eigenvalues are zero and the ordinary inverse does not exist.
Generalized (pseudo) inverse:
Σ⁺ = Σ_{λᵢ ≠ 0} λᵢ⁻¹ φᵢ φᵢᵀ
But: Σ Σ⁺ ≠ I in general; it is only the projection onto the range of Σ
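numpy's `np.linalg.pinv` computes the Moore-Penrose pseudo inverse; a minimal sketch showing the "But:" caveat on a singular (rank-1) matrix:

```python
import numpy as np

# A singular (rank-1) matrix: the ordinary inverse does not exist.
S = np.array([[1.0, 1.0],
              [1.0, 1.0]])

S_pinv = np.linalg.pinv(S)   # Moore-Penrose pseudo inverse

# S @ S_pinv is NOT the identity; it is the projection onto the range of S.
P = S @ S_pinv
```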