Principal Component Analysis

Consider a collection of points

Suppose you want to fit a line

Consider the variance of the distribution on the line

Project onto the Line

different variance

Different line . . .

Maximum Variance

Minimum Variance

Given by the eigenvectors of the covariance matrix of the coordinates of the original points
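
The claim above can be checked numerically. The sketch below is only an illustration: the point cloud is made up, and NumPy is an assumed dependency. It projects the centered points onto lines at many angles, records the variance along each line, and compares the largest and smallest values with the eigenvalues of the covariance matrix.

```python
import numpy as np

# Hypothetical 2-D point cloud, made up for illustration.
points = np.array([
    [2.0, 1.0], [3.0, 2.5], [4.0, 3.0],
    [5.0, 4.5], [6.0, 5.0], [7.0, 6.5],
])
centered = points - points.mean(axis=0)


def projected_variance(theta):
    """Sample variance of the points projected onto a line at angle theta."""
    direction = np.array([np.cos(theta), np.sin(theta)])  # unit vector
    coords = centered @ direction      # scalar position of each point on the line
    return coords.var(ddof=1)          # n - 1 denominator


# Brute force: try many line directions and record the variance of each.
thetas = np.linspace(0.0, np.pi, 1801)
variances = np.array([projected_variance(t) for t in thetas])

# Eigen-decomposition of the covariance matrix of the coordinates.
cov = np.cov(centered, rowvar=False)               # 2 x 2 matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)    # eigenvalues in ascending order

print("max variance over lines:", variances.max())
print("largest eigenvalue:     ", eigenvalues[-1])
print("min variance over lines:", variances.min())
print("smallest eigenvalue:    ", eigenvalues[0])
```

Up to the resolution of the angle grid, the two pairs of numbers should agree, and the best and worst line directions are the corresponding eigenvectors.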

PCA notes…

• Input data set

• Subtract the mean to get a data set with zero mean

• Compute the covariance matrix

• Compute the eigenvalues and eigenvectors of the covariance matrix

• Choose components and form a feature vector. Order by eigenvalues – highest to lowest
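
The steps listed above can be sketched in a few lines. This is a minimal sketch, not a reference implementation: the function name pca and the sample data are hypothetical, NumPy is an assumed dependency, and points are stored one per row.

```python
import numpy as np


def pca(data):
    """Return (mean, eigenvalues, eigenvectors), with components ordered
    from highest to lowest eigenvalue. `data` holds one point per row."""
    data = np.asarray(data, dtype=float)
    mean = data.mean(axis=0)
    centered = data - mean                            # zero-mean data set
    cov = np.cov(centered, rowvar=False)              # covariance matrix
    eigenvalues, eigenvectors = np.linalg.eigh(cov)   # solver for symmetric matrices
    order = np.argsort(eigenvalues)[::-1]             # highest to lowest
    return mean, eigenvalues[order], eigenvectors[:, order]


# Hypothetical input data set, made up for illustration.
data = np.array([
    [1.0, 0.8], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2], [5.0, 4.7], [6.0, 6.3],
])

mean, eigenvalues, F = pca(data)
print("eigenvalues (highest first):", eigenvalues)
print("feature vector F (eigenvectors as columns):")
print(F)
```

np.linalg.eigh is used because a covariance matrix is symmetric; it returns eigenvalues in ascending order, so the explicit reordering is what produces the highest-to-lowest ranking.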

PCA

• To compress, ignore components of lesser significance

• The feature vector F is the matrix of ordered eigenvectors

• Derive the data set in the new coordinates:

• new_data = F^T old_data, where F^T is the transpose of F
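
A minimal sketch of the projection and the compression step, under stated assumptions: FT is read as the transpose of F, old_data is the zero-mean data with one point per column (so the matrix product has matching shapes), the sample values are made up, and NumPy is an assumed dependency.

```python
import numpy as np

# Hypothetical data, one point per COLUMN (2 x n), made up for illustration.
points = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [0.8, 2.1, 2.9, 4.2, 4.7, 6.3],
])
mean = points.mean(axis=1, keepdims=True)
old_data = points - mean                       # zero-mean data set

cov = np.cov(old_data)                         # rows are variables: 2 x 2 matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]          # highest to lowest
F_full = eigenvectors[:, order]                # every component kept

# Same data expressed in the new coordinates.
new_data = F_full.T @ old_data                 # shape 2 x n

# Compression: ignore the component of lesser significance.
F = F_full[:, :1]                              # keep only the first eigenvector
compressed = F.T @ old_data                    # shape 1 x n
approx = F @ compressed + mean                 # lossy reconstruction, 2 x n

print("new_data shape:  ", new_data.shape)
print("compressed shape:", compressed.shape)
print("largest reconstruction error:", np.abs(points - approx).max())
```

Keeping every eigenvector only rotates the data, so it can be recovered exactly. Dropping the second one loses exactly the variance that lay along it, which is why the components with the smallest eigenvalues are the ones to discard.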

Covariance

• C, of 2 random variables X and Y

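$$
C = \begin{pmatrix}
\operatorname{cov}(x,x) & \operatorname{cov}(x,y) & \operatorname{cov}(x,z) \\
\operatorname{cov}(y,x) & \operatorname{cov}(y,y) & \operatorname{cov}(y,z) \\
\operatorname{cov}(z,x) & \operatorname{cov}(z,y) & \operatorname{cov}(z,z)
\end{pmatrix}
$$

where

$$
\operatorname{cov}(X,Y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
$$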
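
The covariance definition can be written out directly. The sketch below assumes the sample form with an n − 1 denominator; the helper names and the sample values are hypothetical, and NumPy is used only as a cross-check.

```python
import numpy as np


def cov(xs, ys):
    """Sample covariance of two equally long sequences (n - 1 denominator)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def covariance_matrix(coords):
    """Matrix C whose entry (i, j) is cov(coords[i], coords[j])."""
    return [[cov(a, b) for b in coords] for a in coords]


# Hypothetical x, y, z coordinates of four points, made up for illustration.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 9.0]
z = [4.0, 3.0, 1.0, 0.0]

C = covariance_matrix([x, y, z])
for row in C:
    print(row)

# Cross-check: np.cov also divides by n - 1 by default.
print("matches np.cov:", np.allclose(C, np.cov([x, y, z])))
```

The diagonal entries cov(x,x), cov(y,y), cov(z,z) are the ordinary variances, and C is symmetric because cov(X,Y) = cov(Y,X).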

Example

Choose bounding box oriented this way

OOBB

OOBB: Fitting

Covariance matrix of point coordinates describes the statistical spread of the cloud.

OBB is aligned with the directions of greatest and least spread (which are guaranteed to be orthogonal).
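
A minimal sketch of this fitting step, assuming NumPy and a hypothetical helper fit_obb: the box axes are the eigenvectors of the covariance matrix, and the box size comes from the minimum and maximum of the points along each axis. The test shape, a 4 × 2 rectangle rotated by 30 degrees, is made up for illustration.

```python
import numpy as np


def fit_obb(points):
    """Fit an oriented bounding box from the covariance of the coordinates.
    Returns (center, axes, half_extents); the columns of `axes` are the
    box directions, ordered from least to greatest spread."""
    points = np.asarray(points, dtype=float)
    mean = points.mean(axis=0)
    cov = np.cov(points - mean, rowvar=False)
    _, axes = np.linalg.eigh(cov)              # orthonormal eigenvectors
    local = (points - mean) @ axes             # coordinates in the box frame
    lo, hi = local.min(axis=0), local.max(axis=0)
    center = mean + axes @ ((lo + hi) / 2.0)   # box center in world coordinates
    half_extents = (hi - lo) / 2.0
    return center, axes, half_extents


# Hypothetical input: corners of a 4 x 2 rectangle, rotated 30 degrees
# and moved to (5, 3).
angle = np.deg2rad(30.0)
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
corners = np.array([[-2.0, -1.0], [2.0, -1.0], [2.0, 1.0], [-2.0, 1.0]])
points = corners @ R.T + np.array([5.0, 3.0])

center, axes, half_extents = fit_obb(points)
print("center:      ", center)
print("half extents:", half_extents)
print("axes (columns):")
print(axes)
```

The eigenvectors are orthogonal because the covariance matrix is symmetric, which is why they can serve directly as the box axes.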

Good Box

OOBB

Add points: worse box

OOBB

More points: terrible box
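
The effect can be reproduced with a small experiment. The slides do not say which points were added, so this sketch assumes extra samples piled onto two opposite corners of a 4 × 2 rectangle. The shape never changes, only the sampling density, yet the covariance axes tilt toward the diagonal and the fitted box grows. The helper name obb_area is hypothetical and NumPy is an assumed dependency.

```python
import numpy as np


def obb_area(points):
    """Area of the 2-D box aligned with the covariance eigenvectors."""
    points = np.asarray(points, dtype=float)
    centered = points - points.mean(axis=0)
    _, axes = np.linalg.eigh(np.cov(centered, rowvar=False))
    local = centered @ axes                    # coordinates in the box frame
    extents = local.max(axis=0) - local.min(axis=0)
    return float(extents[0] * extents[1])


# Corners of a 4 x 2 rectangle: the tightest possible box has area 8.
corners = np.array([[0.0, 0.0], [4.0, 0.0], [4.0, 2.0], [0.0, 2.0]])

# Assumed extra samples: repeats of two opposite corners (one diagonal).
extra = np.array([[0.0, 0.0], [4.0, 2.0]])

good = obb_area(corners)
worse = obb_area(np.vstack([corners] + [extra] * 2))
terrible = obb_area(np.vstack([corners] + [extra] * 20))

print("corners only:      ", good)
print("a few extra points:", worse)
print("many extra points: ", terrible)
```

The added points lie on the original shape, so the enlarged box comes entirely from the covariance responding to how densely each region is sampled.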

OOBB
