2. Mathematical Preliminaries · 2009-10-06



Pattern Recognition: Introduction

Mathematical Preliminaries

• Random Variables

• Linear Transformations

• Eigenvalues and Eigenvectors

• Orthonormal Transformations

• Matrix Inversion


Random Variables

In statistical pattern recognition, a pattern (the input to a PR system) is a d-dimensional feature vector x = (x1, …, xd)^T, treated as a random vector.


Random Variables

A random vector x is fully characterized by its (cumulative) distribution function:

F(x) = Pr{X1 ≤ x1, …, Xd ≤ xd}

or by its density function:

p(x) = ∂^d F(x) / (∂x1 ⋯ ∂xd)


Random Variables

• In pattern recognition, we deal with random vectors drawn from different classes

• Conditional density of class ωi (L classes): p(x | ωi), i = 1, …, L

• Unconditional density function of x (mixture density function):

p(x) = Σi p(x | ωi) P(ωi)

where P(ωi) is the a priori probability of class ωi


Random Variables

• A posteriori probability of ωi given x (Bayes theorem):

P(ωi | x) = p(x | ωi) P(ωi) / p(x)
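A minimal numerical sketch of the Bayes computation; the two Gaussian class-conditional densities and the prior values are illustrative choices, not from the slides:

```python
import numpy as np

def gauss(x, mean, var):
    """1-d normal density p(x | w_i) with the given mean and variance."""
    return np.exp(-0.5 * (x - mean) ** 2 / var) / np.sqrt(2 * np.pi * var)

priors = np.array([0.6, 0.4])                 # a priori probabilities P(w_i)
x = 1.0
likelihoods = np.array([gauss(x, 0.0, 1.0),   # p(x | w_1)
                        gauss(x, 2.0, 1.0)])  # p(x | w_2)

# Mixture density: p(x) = sum_i p(x | w_i) P(w_i)
p_x = np.sum(likelihoods * priors)

# Bayes theorem: P(w_i | x) = p(x | w_i) P(w_i) / p(x)
posteriors = likelihoods * priors / p_x
```

Since x = 1 is equidistant from the two class means, the likelihoods are equal and the posteriors reduce to the priors.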


Random Variables

• Expected vector (mean):

m = E[x] = ∫ x p(x) dx


Random Variables

• Covariance matrix (indicates the dispersion of the distribution):

Σ = E[(x − m)(x − m)^T], with elements σij = E[(xi − mi)(xj − mj)]

σii = σi² is the variance of xi, σi is the standard deviation of xi, and

ρij = σij / (σi σj)

is the correlation coefficient between xi and xj
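Sample estimates of these quantities can be computed directly; the data here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))        # n = 1000 samples of a d = 2 vector

m = X.mean(axis=0)                    # expected vector (sample mean)
Xc = X - m
Sigma = Xc.T @ Xc / (len(X) - 1)      # sample covariance matrix

std = np.sqrt(np.diag(Sigma))         # standard deviations sigma_i
rho = Sigma / np.outer(std, std)      # correlation coefficients rho_ij
```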


Random Variables

R = E[x x^T] = Σ + m m^T

is the autocorrelation matrix
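For sample moments the identity R = Σ + m m^T holds exactly when the covariance is computed with the 1/n normalization; the data below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(loc=[1.0, -2.0], size=(500, 2))

m = X.mean(axis=0)
Xc = X - m
Sigma = Xc.T @ Xc / len(X)     # biased (1/n) sample covariance
R = X.T @ X / len(X)           # sample autocorrelation matrix E[x x^T]

# Identity: R = Sigma + m m^T (the cross terms vanish because Xc has zero mean)
ok = np.allclose(R, Sigma + np.outer(m, m))
```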


Random Variables

Gaussian (normal) distribution: describes data that cluster around a mean (average)

1-d:  p(x) = (1 / (√(2π) σ)) · exp(−(x − m)² / (2σ²))

d ≥ 1:  p(x) = (1 / ((2π)^{d/2} |Σ|^{1/2})) · exp(−½ d²(x))

where d²(x) = (x − m)^T Σ⁻¹ (x − m) is the (squared Mahalanobis) distance function
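A direct implementation of the d-dimensional density (`mvn_density` is a hypothetical helper name, not from the slides):

```python
import numpy as np

def mvn_density(x, m, Sigma):
    """d-dimensional normal density N(m, Sigma) evaluated at x."""
    d = len(m)
    diff = x - m
    # squared Mahalanobis distance d^2(x) = (x - m)^T Sigma^{-1} (x - m)
    d2 = diff @ np.linalg.solve(Sigma, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))
    return np.exp(-0.5 * d2) / norm

# Peak value of a standard 2-d normal is 1 / (2 pi)
p0 = mvn_density(np.zeros(2), np.zeros(2), np.eye(2))
```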


Random Variables

Normal distribution:

• Is uniquely characterized by the expected vector m and the covariance matrix Σ

• The assumption of normality is a reasonable approximation for many real data sets

• However, normality should not be assumed without good justification

• More about normal distribution


Linear Transformations

A linear transformation of x is y = A^T x. Under this transformation the mean and covariance become:

m_y = A^T m_x,  Σ_y = A^T Σ_x A


Linear Transformations


Eigenvalues and Eigenvectors

The eigenvectors φi and eigenvalues λi of Σ satisfy

Σ φi = λi φi,  i = 1, …, d

Writing this for all i:

Σ Φ = Φ Λ


Eigenvalues and Eigenvectors

Λ = diag(λ1, …, λd) is the eigenvalue matrix

Φ = [φ1 ⋯ φd] is the eigenvector matrix


Eigenvalues and Eigenvectors

The eigenvectors corresponding to two different eigenvalues are orthogonal:

φi^T φj = 0 for λi ≠ λj

With normalized eigenvectors, Φ^T Φ = I
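This can be checked numerically for a small symmetric matrix (the entries are illustrative):

```python
import numpy as np

Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])        # symmetric, positive definite

lam, Phi = np.linalg.eigh(Sigma)      # eigenvalues and eigenvectors (columns)

# Sigma Phi = Phi Lambda
ok_eig = np.allclose(Sigma @ Phi, Phi * lam)

# Orthonormality of the eigenvectors: Phi^T Phi = I
ok_orth = np.allclose(Phi.T @ Phi, np.eye(2))
```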


Eigenvalues and Eigenvectors


Eigenvalues and Eigenvectors

There is no correlation in the transformed space y = Φ^T x:

Φ^T Σ Φ = Λ
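A numerical check of the decorrelation property, using an illustrative covariance matrix:

```python
import numpy as np

Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
lam, Phi = np.linalg.eigh(Sigma)

# Covariance after the transformation y = Phi^T x is diagonal:
# Phi^T Sigma Phi = Lambda, so the off-diagonal (correlation) terms vanish
S_y = Phi.T @ Sigma @ Phi
ok_diag = np.allclose(S_y, np.diag(lam))
```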


Eigenvalues and Eigenvectors

The eigenvectors are the principal axes of the distribution:

• We rotate the coordinate system: y = Φ^T x

• There is no correlation between y1 and y2

• λ1 gives the variance of y1, λ2 gives the variance of y2


Whitening

After applying the orthonormal transformation, we can add another transformation that makes the covariance matrix equal to I:

y = Λ^{-1/2} Φ^T x,  so that  Λ^{-1/2} (Φ^T Σ Φ) Λ^{-1/2} = I

Purpose: to change the scales of the principal components in proportion to 1/√λi
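A whitening sketch on an illustrative covariance matrix:

```python
import numpy as np

Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
lam, Phi = np.linalg.eigh(Sigma)

# Whitening transformation y = Lambda^{-1/2} Phi^T x, i.e. A = Phi Lambda^{-1/2}
A = Phi / np.sqrt(lam)                # divide column i by sqrt(lambda_i)

# Covariance of y becomes the identity: A^T Sigma A = I
ok_white = np.allclose(A.T @ Sigma @ A, np.eye(2))
```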


Simultaneous Diagonalization

Goal: diagonalize Σ1 and Σ2 simultaneously by a linear transformation A:

A^T Σ1 A = I,  A^T Σ2 A = Λ
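One standard construction is to whiten the first matrix and then diagonalize the second in the whitened space; the matrices below are illustrative:

```python
import numpy as np

# Two symmetric positive-definite matrices (entries are illustrative)
S1 = np.array([[4.0, 2.0], [2.0, 3.0]])
S2 = np.array([[2.0, 0.5], [0.5, 1.0]])

# Step 1: whiten S1
lam1, Phi1 = np.linalg.eigh(S1)
W = Phi1 / np.sqrt(lam1)                  # W^T S1 W = I

# Step 2: diagonalize S2 in the whitened space
lam2, Psi = np.linalg.eigh(W.T @ S2 @ W)
A = W @ Psi                               # combined transformation

ok1 = np.allclose(A.T @ S1 @ A, np.eye(2))       # S1 -> I
ok2 = np.allclose(A.T @ S2 @ A, np.diag(lam2))   # S2 -> diagonal
```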


Orthonormal Transformations

Properties:

a) Eigenvalues are invariant: for A with A^T A = I, the matrix A^T Σ A has the same eigenvalues as Σ


Orthonormal Transformations


Orthonormal Transformations

b) Euclidean distance is invariant:

‖y1 − y2‖² = (x1 − x2)^T A A^T (x1 − x2) = ‖x1 − x2‖²

Distance in Y-space is the same as in X-space
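A quick numerical check with an arbitrary rotation (the angle and points are illustrative):

```python
import numpy as np

theta = 0.7                                # any rotation matrix is orthonormal
A = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x1 = np.array([1.0, 2.0])
x2 = np.array([-0.5, 3.0])

d_x = np.linalg.norm(x1 - x2)              # distance in X-space
d_y = np.linalg.norm(A.T @ (x1 - x2))      # distance in Y-space (y = A^T x)
ok = np.isclose(d_x, d_y)
```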


Trace of Covariance Matrix

Let’s look at the sum of the diagonal terms of Σ:

tr Σ = Σi σii = Σi λi

The trace of Σ is the sum of all eigenvalues and is invariant under any orthonormal transformation


Determinant of Covariance Matrix

The determinant of Σ is equal to the product of all eigenvalues,

|Σ| = Πi λi,

and is invariant under any orthonormal transformation
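Both trace and determinant identities, and their invariance under rotation, can be verified on an illustrative matrix:

```python
import numpy as np

Sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
lam, _ = np.linalg.eigh(Sigma)

ok_trace = np.isclose(np.trace(Sigma), lam.sum())       # tr = sum of eigenvalues
ok_det = np.isclose(np.linalg.det(Sigma), lam.prod())   # det = product of eigenvalues

# Invariance under an orthonormal transformation (rotation by 0.3 rad)
t = 0.3
A = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
S_rot = A.T @ Sigma @ A
ok_inv = (np.isclose(np.trace(S_rot), np.trace(Sigma))
          and np.isclose(np.linalg.det(S_rot), np.linalg.det(Sigma)))
```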


Rank of Covariance Matrix

Rank of Σ is equal to the number of nonzero eigenvalues

Relation between |S| and :


Matrix Inversion

Introduction to matrix inversion: 1 and 2


Generalized Inverse (Pseudo-Inverse)


Generalized Inverse (Pseudo-Inverse)

However, if the matrix is singular, some eigenvalues are zero and the ordinary inverse does not exist.

Generalized (pseudo) inverse:

A⁺ = (A^T A)^{-1} A^T

But: A A⁺ ≠ I in general (only A⁺ A = I, when A has full column rank)
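A sketch with a tall full-column-rank matrix (entries are illustrative), showing the left-inverse property, the failure of the right-inverse property, and agreement with NumPy's built-in pseudo-inverse:

```python
import numpy as np

# Tall matrix with full column rank
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

A_pinv = np.linalg.inv(A.T @ A) @ A.T     # generalized (pseudo) inverse

ok_left = np.allclose(A_pinv @ A, np.eye(2))   # A+ A = I
ok_right = np.allclose(A @ A_pinv, np.eye(3))  # A A+ is only a projection, not I
ok_np = np.allclose(A_pinv, np.linalg.pinv(A)) # matches np.linalg.pinv here
```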
