Bioinformatics
Other data reduction techniques
Kristel Van Steen, PhD, ScD ([email protected])
Université de Liege - Institut Montefiore
2008-2009


Acknowledgements

Material based on:
• work from Pradeep Mummidi
• class notes from Christine Steinhoff

Outline

• Intuition behind PCA
• Theory behind PCA
• Applications of PCA
• Extensions of PCA
• Multidimensional scaling MDS (not to be confused with MDR)

Intuition behind PCA

Introduction

Most scientific or industrial data are multivariate (very large data sets).

Is all the data useful?

If not, how do we quickly extract only the useful information?


Problem

When we use traditional techniques, it is not easy to extract useful information from multivariate data:

1) Many bivariate plots are needed.

2) Bivariate plots, however, mainly represent correlations between variables (not samples).


Visualization Problem

Not easy to visualize multivariate data:
- 1D: dot
- 2D: bivariate plot (i.e. X-Y plane)
- 3D: X-Y-Z plot
- 4D: ternary plot with a color code / tetrahedron
- 5D, 6D, etc.: ???

Visualization?

As the number of variables increases, the data space becomes harder to visualize.


Basics of PCA

PCA is useful when we need to extract useful information from multivariate data sets.

The technique works by reducing the dimensionality of the data.

Therefore, trends in multivariate data are easily visualized.

Variable Reduction Procedure

Principal component analysis is a variable reduction procedure. It is useful when you have obtained data on a number of variables (possibly a large number of variables), and believe that there is some redundancy in those variables.

Redundancy means that some of the variables are correlated with one another, possibly because they are measuring the same construct.

Because of this redundancy, you believe that it should be possible to reduce the observed variables to a smaller number of principal components (artificial variables) that will account for most of the variance in the observed variables.

What is a Principal Component?

A principal component can be defined as a linear combination of optimally weighted observed variables.

"Optimally weighted" refers to how subject scores on a principal component are computed.

7-Item Measure of Job Satisfaction


General Formula

Below is the general form of the formula to compute scores on the first component extracted (created) in a principal component analysis:

C1 = b11(X1) + b12(X2) + ... + b1p(Xp)

where

C1 = the subject's score on principal component 1 (the first component extracted)

b1p = the regression coefficient (or weight) for observed variable p, as used in creating principal component 1

Xp = the subject's score on observed variable p.

For example, assume that component 1 in the present study was the "satisfaction with supervision" component. You could determine each subject's score on principal component 1 by using the following fictitious formula:

C1 = .44 (X1) + .40 (X2) + .47 (X3) + .32 (X4) + .02 (X5) + .01 (X6) + .03 (X7)

Obviously, a different equation, with different regression weights, would be used to compute subject scores on component 2 (the satisfaction with pay component). Below is a fictitious illustration of this formula:

C2 = .01 (X1) + .04 (X2) + .02 (X3) + .02 (X4) + .48 (X5) + .31 (X6) + .39 (X7)
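
As a quick arithmetic check, the two fictitious weight vectors above can be applied to a subject's item scores directly. Below is a minimal R sketch; the 1-7 item responses are invented only for illustration.

R (sketch):
# Fictitious component weights from the slides (7 job-satisfaction items)
b1 <- c(.44, .40, .47, .32, .02, .01, .03)  # "satisfaction with supervision"
b2 <- c(.01, .04, .02, .02, .48, .31, .39)  # "satisfaction with pay"

# Hypothetical item scores for one subject (invented for illustration)
x <- c(6, 5, 6, 7, 3, 2, 4)

C1 <- sum(b1 * x)   # subject's score on component 1
C2 <- sum(b2 * x)   # subject's score on component 2
c(C1 = C1, C2 = C2)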

Number of Components Extracted

If a principal component analysis were performed on data from the 7-item job satisfaction questionnaire, one might get the impression that only two components are created. However, such an impression would not be entirely correct.

In reality, the number of components extracted in a principal component analysis is equal to the number of observed variables being analyzed.

However, in most analyses, only the first few components account for meaningful amounts of variance, so only these first few components are retained, interpreted, and used in subsequent analyses (such as in multiple regression analyses).

Characteristics of Principal Components

The first component extracted in a principal component analysis accounts for a maximal amount of total variance in the observed variables.

Under typical conditions, this means that the first component will be correlated with at least some of the observed variables. It may be correlated with many.

The second component extracted will have two important characteristics. First, this component will account for a maximal amount of variance in the data set that was not accounted for by the first component.

Under typical conditions, this means that the second component will be correlated with some of the observed variables that did not display strong correlations with component 1.

The second characteristic of the second component is that it will be uncorrelated with the first component. Literally, if you were to compute the correlation between components 1 and 2, that correlation would be zero.

The remaining components that are extracted in the analysis display the same two characteristics: each component accounts for a maximal amount of variance in the observed variables that was not accounted for by the preceding components, and is uncorrelated with all of the preceding components.

Generalization

A principal component analysis proceeds in this fashion, with each new component accounting for progressively smaller amounts of variance (this is why only the first few components are usually retained and interpreted).

When the analysis is complete, the resulting components will display varying degrees of correlation with the observed variables, but are completely uncorrelated with one another.


Theory behind PCA

Theory behind PCA: Linear Algebra

OUTLINE

What do we need from "linear algebra" for understanding principal component analysis?

• Standard deviation, variance, covariance
• The covariance matrix
• Symmetric matrices and orthogonality
• Eigenvalues and eigenvectors
• Properties

Motivation

Motivation

[Scatter plot: Protein 1 vs. Protein 2, measured for 200 patients]

Motivation

Microarray experiment: genes 1 to 22,000 measured for patients 1 to 200.

? Visualize ?

? Which genes are important ?

? For which subgroup of patients ?

Motivation

[Data matrix: genes 1 to 200 for patients 1 to 10]

Basics for Principal Component Analysis

• Orthogonal/orthonormal vectors
• Some theorems...
• Standard deviation, variance, covariance
• The covariance matrix
• Eigenvalues and eigenvectors

Standard Deviation

The average distance from the mean of the data set to a point.

Mean: mean = (x1 + x2 + ... + xn) / n

Example:
Measurement 1: 0, 8, 12, 20
Measurement 2: 8, 9, 11, 12

        M1      M2
Mean    10      10
SD      8.33    1.83

Variance

Example:
Measurement 1: 0, 8, 12, 20
Measurement 2: 8, 9, 11, 12

        M1      M2
Mean    10      10
SD      8.33    1.83
Var     69.33   3.33
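
These values use the sample (n-1) formulas and can be reproduced directly in R; a minimal sketch:

R (sketch):
m1 <- c(0, 8, 12, 20)
m2 <- c(8, 9, 11, 12)
mean(m1); mean(m2)   # both 10
sd(m1);   sd(m2)     # 8.33 and 1.83 (rounded)
var(m1);  var(m2)    # 69.33 and 3.33 (rounded)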

Covariance

Standard deviation and variance are 1-dimensional.

How much do the dimensions vary from the mean with respect to each other?

Covariance measures between 2 dimensions:

cov(X, Y) = 1/(n-1) Σ (Xi - mean(X)) (Yi - mean(Y))

We easily see that if X = Y we end up with the variance.
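
Continuing the same toy example, the covariance between the two measurements, and the fact that cov(X, X) reduces to var(X), can be checked in R; a minimal sketch:

R (sketch):
m1 <- c(0, 8, 12, 20)
m2 <- c(8, 9, 11, 12)
cov(m1, m2)          # covariance between the two measurements (14.67)
cov(m1, m1)          # equals var(m1) = 69.33
cov(cbind(m1, m2))   # the full 2 x 2 covariance matrix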

Covariance Matrix

Let X be a random vector. Then the covariance matrix of X, denoted by Cov(X), is the matrix whose (i, j) entry is cov(Xi, Xj).

The diagonals of Cov(X) are the variances var(Xi).

In matrix notation, Cov(X) = E[(X - E[X])(X - E[X])^t].

The covariance matrix is symmetric.

Symmetric Matrix

Let A = (a_ij) be a square matrix of size n x n. The matrix A is symmetric if a_ij = a_ji for all i, j (equivalently, A = A^t).

Orthogonality / Orthonormality

[Plot: the unit vectors v1 = (1, 0) and v2 = (0, 1) in the plane]

<v1, v2> = <(1 0), (0 1)> = 0

Two vectors v1 and v2 for which <v1, v2> = 0 holds are said to be orthogonal.

Unit vectors which are orthogonal are said to be orthonormal.

Eigenvalues / Eigenvectors

Let A be an n x n square matrix and x an n x 1 column vector. Then a (right) eigenvector of A is a nonzero vector x such that

A x = λ x

for some scalar λ (the eigenvalue); x is the corresponding eigenvector.

Procedure:
Finding the eigenvalues: solve det(A - λI) = 0 for the lambdas.
Then find the corresponding eigenvectors.

R: eigen(matrix)
Matlab: eig(matrix)
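
As a concrete illustration of eigen(), the sketch below uses a small symmetric matrix chosen only for the example and verifies the defining property A x = λ x.

R (sketch):
A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)   # a small symmetric matrix
e <- eigen(A)
e$values    # eigenvalues: 3 and 1
e$vectors   # eigenvectors, one per column
# check the defining property for the first eigenpair (should be ~ the zero vector):
A %*% e$vectors[, 1] - e$values[1] * e$vectors[, 1]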

Some Remarks

If A and B are matrices whose sizes are such that the given operations are defined and c is any scalar, then:

(A^t)^t = A
(A + B)^t = A^t + B^t
(cA)^t = c A^t
(AB)^t = B^t A^t
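
These identities are easy to sanity-check numerically; a minimal R sketch with two small random matrices (the dimensions are arbitrary):

R (sketch):
set.seed(1)
A <- matrix(rnorm(6), nrow = 2)    # 2 x 3
B <- matrix(rnorm(12), nrow = 3)   # 3 x 4
all.equal(t(t(A)), A)                  # (A^t)^t = A
all.equal(t(3 * A), 3 * t(A))          # (cA)^t = c A^t
all.equal(t(A %*% B), t(B) %*% t(A))   # (AB)^t = B^t A^t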

Now, ...

We have enough definitions to go into the procedure of how to perform Principal Component Analysis.

Theory behind PCA: Linear Algebra Applied

OUTLINE

What is principal component analysis good for?

Principal Component Analysis (PCA):

• The basic idea of principal component analysis
• The idea of transformation
• How to get there? The mathematics part
• Some remarks
• Basic algorithmic procedure

Idea of PCA

• Introduced by Pearson (1901) and Hotelling (1933) to describe the variation in a set of multivariate data in terms of a set of uncorrelated variables

• We typically have a data matrix of n observations on p correlated variables x1, x2, ..., xp

• PCA looks for a transformation of the xi into p new variables yi that are uncorrelated

Idea

Data matrix X (genes x1, ..., xp in the rows; patients 1, ..., n in the columns): the dimension is high.

So how can we reduce the dimension?

Simplest way: take the first one, two, three variables, plot them and discard the rest.

Obviously a very bad idea.

Transformation

We want to find a transformation that involves ALL columns, not only the first ones.

So find a new basis, ordered such that the first component carries almost ALL the information of the whole data set.

We are looking for a transformation of the data matrix X (p x n) such that

Y = a^t X = a1 X1 + a2 X2 + ... + ap Xp

Transformation

Maximize the variance of the projection of the observations on the Y variables!

Find a such that Var(a^t X) is maximal.

The matrix C = Var(X) is the covariance matrix of the Xi variables.

What is a reasonable choice for the a?

Remember: we wanted a transformation that maximizes "information", that means: captures "variance in the data".
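
To make this concrete, the following R sketch simulates two correlated variables (all numbers are illustrative) and compares the variance of the projection onto the leading eigenvector of the covariance matrix with the variance along a few random unit directions; the eigenvector direction should come out largest.

R (sketch):
set.seed(42)
x1 <- rnorm(200)
x2 <- 0.8 * x1 + rnorm(200, sd = 0.5)    # two correlated variables
X  <- cbind(x1, x2)

a_pc <- eigen(cov(X))$vectors[, 1]       # leading eigenvector (unit length)
var(drop(X %*% a_pc))                    # variance of the projection on a_pc

# variance along a few random unit directions, for comparison
for (i in 1:3) {
  a <- rnorm(2); a <- a / sqrt(sum(a^2))
  print(var(drop(X %*% a)))
}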

Transformation

Can we intuitively see that in a picture?

[Two scatter plots with candidate projection directions, labelled "Good" and "Better"]

Transformation

[Scatter plot with the two orthogonal directions PC1 and PC2: orthogonality]

How do we get there?

Data matrix X: genes x1, ..., xp in the rows; patients 1, ..., n in the columns.

X is a real-valued p x n matrix.

Cov(X) is a real-valued p x p matrix or n x n matrix:
-> decide whether you want to analyse patient groups, or do you want to analyse gene groups?

How do we get there?

Let's decide for genes:

Cov(X) =
| v(x1)     c(x1,x2)  ...  c(x1,xp) |
| c(x2,x1)  v(x2)     ...  c(x2,xp) |
| ...       ...       ...  ...      |
| c(xp,x1)  c(xp,x2)  ...  v(xp)    |

How do we get there?

Some features of Cov(X):

• Cov(X) is a symmetric p x p matrix
• The diagonal terms of Cov(X) are the variances of the genes across patients
• The off-diagonal terms of Cov(X) are the covariances between gene vectors
• Cov(X) captures the correlations between all possible pairs of measurements
• In the diagonal terms, by assumption, large values correspond to interesting dynamics
• In the off-diagonal terms, large values correspond to high redundancy

How do we get there?

The principal components of X are the eigenvectors of Cov(X).

Assume we can "manipulate" X a bit: let's call this Y. Y should be manipulated in a way that it is a bit more optimal than X was.

What does optimal mean? That means: the variances Var(yi) should be LARGE, and the covariances Cov(yi, yj) for i ≠ j should be SMALL.

In other words: Cov(Y) should be diagonal, with large values on the diagonal.

How do we get there?

The manipulation is a change of basis with orthonormal vectors, ordered in a way that the most important one comes first (principal)...

How do we put this in mathematical terms? Find an orthonormal P such that

Y = P X

with Cov(Y) diagonalized.

Then the rows of P are the principal components of X.

How do we get there?

With Y = P X and Cov(Y) = 1/(n-1) Y Y^t (X is assumed to be centred), we get

Cov(Y) = 1/(n-1) (PX)(PX)^t
       = 1/(n-1) P X X^t P^t
       = 1/(n-1) P (X X^t) P^t
       = 1/(n-1) P A P^t,   with A := X X^t

How do we get there?

A is symmetric. Therefore there is a matrix E of eigenvectors and a diagonal matrix D such that

A = E D E^t

Now define P to be the transpose of the matrix E of eigenvectors: P := E^t.

Then we can write A as

A = P^t D P

How do we get there?

Now we can go back to our covariance expression:

Cov(Y) = 1/(n-1) P A P^t
       = 1/(n-1) P (P^t D P) P^t
       = 1/(n-1) (P P^t) D (P P^t)

How do we get there?

The inverse of an orthogonal matrix is its transpose (due to its definition): P^(-1) = P^t.

In our context that means:

Cov(Y) = 1/(n-1) (P P^(-1)) D (P P^(-1)) = 1/(n-1) D

How do we get there?

P diagonalizes Cov(Y), where P is the transpose of the matrix of eigenvectors of X X^t.

The principal components of X are the eigenvectors of X X^t (that is the same as the rows of P).

The i-th diagonal value of Cov(Y) is the variance of X along p_i (= along the i-th principal component).

Essentially we need to compute the EIGENVALUES and EIGENVECTORS of the covariance matrix of the original matrix X: the eigenvalues give the explained variance, the eigenvectors are the principal components.
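
The chain of equalities above can be checked numerically. The R sketch below uses simulated data (purely illustrative), builds P from the eigenvectors of the covariance matrix, and verifies that the covariance of the projected data Y = P X is (numerically) diagonal with the eigenvalues on the diagonal.

R (sketch):
set.seed(7)
X <- matrix(rnorm(3 * 100), nrow = 3)     # 3 variables x 100 observations
X <- X - rowMeans(X)                      # centre each variable (row)
C <- X %*% t(X) / (ncol(X) - 1)           # covariance matrix, 3 x 3
E <- eigen(C)$vectors                     # eigenvectors in the columns
P <- t(E)                                 # rows of P = principal components
Y <- P %*% X                              # projected data
round(Y %*% t(Y) / (ncol(X) - 1), 10)     # ~ diagonal, entries = eigenvalues
eigen(C)$values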

Some Remarks

• If you multiply one variable by a scalar you get different results.
• This is because PCA uses the covariance matrix (and not the correlation matrix).
• PCA should be applied to data that have approximately the same scale in each variable.
• The relative variance explained by each PC is given by eigenvalue / sum(eigenvalues) (see the sketch after this list).
• When to stop? For example: keep enough PCs to reach a cumulative variance explained of > 50-70%.
• Kaiser criterion: keep PCs with eigenvalues > 1.
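
Both stopping rules are one-liners once the eigenvalues are available; a minimal R sketch, using the correlation matrix of a small example data set (USArrests, shipped with base R):

R (sketch):
ev <- eigen(cor(USArrests))$values   # eigenvalues of a correlation matrix
prop <- ev / sum(ev)                 # relative variance explained by each PC
cumsum(prop)                         # cumulative variance explained (stop at > 50-70%)
which(ev > 1)                        # Kaiser criterion: PCs with eigenvalue > 1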


Some Remarks

If variables have very heterogeneous variances, we standardize them.

The standardized variables Xi*:

Xi* = (Xi - mean) / sqrt(variance)

The new variables all have the same variance, so each variable has the same weight.
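
In R this standardization is what scale() does (centre each variable and divide by its standard deviation); a minimal sketch on the toy data from the earlier example:

R (sketch):
X  <- cbind(a = c(0, 8, 12, 20), b = c(8, 9, 11, 12))
Xs <- scale(X)          # (Xi - mean) / sd, column by column
apply(Xs, 2, mean)      # ~ 0 for every variable
apply(Xs, 2, var)       # 1 for every variable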

REMARKS

• PCA is useful for finding new, more informative, uncorrelated features; it reduces dimensionality by rejecting low-variance features.

• PCA is only powerful if the biological question is related to the highest variance in the dataset.

Algorithm

Data = (Data.old - mean) / sqrt(variance)

Cov(Data) = 1/(N-1) Data * tr(Data)

Find the eigenvectors/eigenvalues of Cov(Data) (function eigen in R, eig in Matlab) and sort them by decreasing eigenvalue.

Eigenvectors: V; eigenvalues: D. Set P = tr(V).

Project the original data: Y = P * Data.

Plot as many components as necessary.
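
Putting the steps together, a minimal R sketch of this procedure on simulated data (5 variables in the rows, as in the slides; all numbers are illustrative):

R (sketch):
set.seed(123)
data.old <- matrix(rnorm(5 * 50), nrow = 5)   # 5 variables x 50 observations

# 1. standardize each variable (row): subtract the mean, divide by the sd
data <- t(scale(t(data.old)))

# 2. covariance matrix: 1/(N-1) Data * tr(Data)
C <- data %*% t(data) / (ncol(data) - 1)

# 3. eigenvectors/eigenvalues (eigen() already sorts by decreasing eigenvalue)
e <- eigen(C)
V <- e$vectors        # eigenvectors
D <- e$values         # eigenvalues
P <- t(V)             # rows = principal components

# 4. project the original data
Y <- P %*% data

# 5. plot as many components as necessary, e.g. the first two
plot(Y[1, ], Y[2, ], xlab = "PC1", ylab = "PC2")

For comparison, base R's prcomp(t(data.old), scale. = TRUE) performs essentially the same analysis (components may differ in sign).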

Applications of PCA

Applications

Include:

• Image processing
• Microarray experiments
• Pattern recognition

OUTLINE

Principal component analysis in bioinformatics

Example 1

Lefkovits et al.

Data matrix X: spots x1, ..., xp in the rows; clones 1, ..., n in the columns. X is a real-valued p x n matrix.

They want to analyse the relatedness of clones, so Cov(X) is a real-valued n x n matrix.

They take the correlation matrix (which is, on top of the covariance, the division by the standard deviations).


Example 2

Yang et al.

[PCA plots of the microarray experiments, coloured by experimental group: babo, tkv, control]

Ulloa-Montoya et al.

[PCA plot of cell types: multipotent adult progenitor cells, pluripotent embryonic stem cells, mesenchymal stem cells]


Yang et al.

But: we only see the different experiments.

If we do it the other way round, that is, analysing the genes instead of the experiments, we see a grouping of genes. But we never see both together.

So, can we somehow relate the experiments and the genes? That means, group genes whose expression might be explained by the respective experimental group (tkv, babo, control)?

This leads into "correspondence analysis".

Extensions of PCA

Difficult example

Non-linear PCA

Kernel PCA

(http://research.microsoft.com/users/Cambridge/nicolasl/papers/eigen_dimred.pdf)
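
The core computation behind kernel PCA, an eigendecomposition of a centred kernel matrix in place of a covariance matrix, fits in a few lines. The R sketch below is a minimal illustration with an RBF (Gaussian) kernel on simulated 2-D data; the kernel width sigma and the data are assumptions made only for the example.

R (sketch):
set.seed(1)
X <- matrix(rnorm(2 * 100), ncol = 2)                 # 100 points in 2-D
sigma <- 1                                            # RBF kernel width (illustrative)

D2 <- as.matrix(dist(X))^2                            # squared Euclidean distances
K  <- exp(-D2 / (2 * sigma^2))                        # RBF kernel matrix

n  <- nrow(K)
J  <- matrix(1 / n, n, n)
Kc <- K - J %*% K - K %*% J + J %*% K %*% J           # centre the kernel matrix in feature space

e  <- eigen(Kc)
Y  <- e$vectors[, 1:2] %*% diag(sqrt(e$values[1:2]))  # projections onto the first 2 kernel PCs
plot(Y, xlab = "kPC1", ylab = "kPC2")

In practice an existing implementation such as kpca() in the kernlab package can be used instead.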

PCA in feature space

Side remark

Summary of kernel PCA

Multidimensional Scaling (MDS)

Common stress functions
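
One commonly used stress function is Kruskal's stress-1, stress = sqrt( Σ (d_ij - dhat_ij)^2 / Σ d_ij^2 ), where d_ij are the distances in the low-dimensional configuration and dhat_ij the fitted disparities; non-metric MDS minimizes such a stress function, while classical (metric) MDS has a closed-form solution via an eigendecomposition. A minimal R sketch of classical MDS with the base function cmdscale(), on an illustrative distance matrix:

R (sketch):
D   <- dist(scale(USArrests))     # a dissimilarity matrix (example data from base R)
fit <- cmdscale(D, k = 2)         # classical MDS into 2 dimensions
plot(fit, xlab = "MDS 1", ylab = "MDS 2")
# non-metric MDS, which minimizes a stress function, is available e.g. as isoMDS() in the MASS package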
