Bioinformatics
Other data reduction techniques
Kristel Van Steen, PhD, ScD
([email protected])
Université de Liege - Institut Montefiore
2008-2009
Acknowledgements
Material based on:
work from Pradeep Mummidi
class notes from Christine Steinhoff
Outline
Intuition behind PCA
Theory behind PCA
Applications of PCA
Extensions of PCA
Multidimensional scaling MDS (not to be confused with MDR)
Intuition behind PCA
Introduction
Most scientific or industrial data are multivariate (huge amounts of data).
Is all the data useful?
If not, how do we quickly extract only the useful information?
Problem
When we use traditional techniques, it is not easy to extract useful information from multivariate data:
1) Many bivariate plots are needed
2) Bivariate plots, however, mainly represent correlations between variables (not samples).
Visualization Problem
It is not easy to visualize multivariate data:
- 1D: dot
- 2D: bivariate plot (i.e. X-Y plane)
- 3D: X-Y-Z plot
- 4D: ternary plot with a color code / tetrahedron
- 5D, 6D, etc.: ???
Visualization?
As the number of variables increases, the data space becomes harder to visualize.
Basics of PCA
PCA is useful when we need to extract useful information from multivariate data sets.
The technique is based on reducing the dimensionality of the data.
As a result, trends in multivariate data are easily visualized.
Variable Reduction Procedure
Principal component analysis is a variable reduction procedure. It is useful when you have obtained data on a number of variables (possibly a large number), and believe that there is some redundancy in those variables.
Redundancy means that some of the variables are correlated with one another, possibly because they are measuring the same construct.
Because of this redundancy, you believe that it should be possible to reduce the observed variables into a smaller number of principal components (artificial variables) that will account for most of the variance in the observed variables.
What is a Principal Component?
A principal component can be defined as a linear combination of optimally weighted observed variables.
The definition reflects how subject scores on a principal component are computed.
7-Item Measure of Job Satisfaction
General Formula
Below is the general form of the formula for computing scores on the first component extracted (created) in a principal component analysis:
C1 = b11(X1) + b12(X2) + ... + b1p(Xp)
where
C1 = the subject's score on principal component 1 (the first component extracted)
b1p = the regression coefficient (or weight) for observed variable p, as used in creating principal component 1
Xp = the subject's score on observed variable p.
For example, assume that component 1 in the present study was the "satisfaction with supervision" component. You could determine each subject's score on principal component 1 by using the following fictitious formula:
C1 = .44(X1) + .40(X2) + .47(X3) + .32(X4) + .02(X5) + .01(X6) + .03(X7)
Obviously, a different equation, with different regression weights, would be used to compute subject scores on component 2 (the satisfaction with pay component). Below is a fictitious illustration of this formula:
C2 = .01(X1) + .04(X2) + .02(X3) + .02(X4) + .48(X5) + .31(X6) + .39(X7)
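These score computations are just matrix-vector products. As a sketch, the R snippet below applies the two fictitious weight vectors above to made-up data (the data matrix here is random and purely illustrative):

# Fictitious component scores for the 7-item job satisfaction example
set.seed(42)
X <- matrix(rnorm(10 * 7), nrow = 10, ncol = 7)  # 10 subjects x 7 items (made-up data)
b1 <- c(.44, .40, .47, .32, .02, .01, .03)       # weights for component 1 (supervision)
b2 <- c(.01, .04, .02, .02, .48, .31, .39)       # weights for component 2 (pay)
C1 <- X %*% b1   # each subject's score on principal component 1
C2 <- X %*% b2   # each subject's score on principal component 2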
Number of Components Extracted
If a principal component analysis were performed on data from the 7-item job satisfaction questionnaire, it might seem that only two components would be created. However, such an impression would not be entirely correct.
In reality, the number of components extracted in a principal component analysis is equal to the number of observed variables being analyzed.
However, in most analyses, only the first few components account for meaningful amounts of variance, so only these first few components are retained, interpreted, and used in subsequent analyses (such as multiple regression analyses).
Characteristics of Principal Components
The first component extracted in a principal component analysis accounts for a maximal amount of total variance in the observed variables.
Under typical conditions, this means that the first component will be correlated with at least some of the observed variables. It may be correlated with many.
The second component extracted will have two important characteristics. First, this component will account for a maximal amount of variance in the data set that was not accounted for by the first component.
Under typical conditions, this means that the second component will be correlated with some of the observed variables that did not display strong correlations with component 1.
The second characteristic of the second component is that it will be uncorrelated with the first component. Literally, if you were to compute the correlation between components 1 and 2, that correlation would be zero.
The remaining components extracted in the analysis display the same two characteristics: each component accounts for a maximal amount of variance in the observed variables that was not accounted for by the preceding components, and is uncorrelated with all of the preceding components.
Generalization
A principal component analysis proceeds in this fashion, with each new component accounting for progressively smaller amounts of variance (this is why only the first few components are usually retained and interpreted).
When the analysis is complete, the resulting components will display varying degrees of correlation with the observed variables, but are completely uncorrelated with one another.
References
http://support.sas.com/publishing/pubcat/chaps/55129.pdf
http://www.cs.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf
http://www.cis.hut.fi/jhollmen/dippa/node30.html
Theory behind PCA
Theory behind PCA: Linear Algebra
OUTLINE
What do we need from "linear algebra" for understanding principal component analysis?
• Standard deviation, variance, covariance
• The covariance matrix
• Symmetric matrices and orthogonality
• Eigenvalues and eigenvectors
• Properties
Motivation
[Figure: scatter plot of Protein 2 against Protein 1]
Proteins 1 and 2 measured for 200 patients
Motivation
[Figure: microarray experiment data matrix, genes 1 to 22,000 (rows) by patients 1 to 200 (columns)]
? Visualize ?
? Which genes are important ?
? For which subgroup of patients ?
Motivation
[Figure: data matrix, genes 1 to 200 (rows) by patients 1 to 10 (columns)]
Basics for Principal Component Analysis
• Orthogonal/orthonormal vectors
• Some theorems...
• Standard deviation, variance, covariance
• The covariance matrix
• Eigenvalues and eigenvectors
Standard Deviation
The average distance from the mean of the data set to a point:
s = sqrt( Σ (Xi - X̄)² / (n-1) )
MEAN:
X̄ = (1/n) Σ Xi
Example:
Measurement 1: 0, 8, 12, 20
Measurement 2: 8, 9, 11, 12

        M1     M2
Mean    10     10
SD      8.33   1.83
Variance
Variance is the standard deviation squared:
s² = Σ (Xi - X̄)² / (n-1)
Example:
Measurement 1: 0, 8, 12, 20
Measurement 2: 8, 9, 11, 12

        M1      M2
Mean    10      10
SD      8.33    1.83
Var     69.33   3.33
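These numbers can be verified directly in R (sd() and var() use the sample, n-1, denominator assumed here):

m1 <- c(0, 8, 12, 20)
m2 <- c(8, 9, 11, 12)
mean(m1); mean(m2)   # both 10
sd(m1);   sd(m2)     # 8.33 and 1.83
var(m1);  var(m2)    # 69.33 and 3.33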
Covariance
Standard deviation and variance are 1-dimensional.
How much do the dimensions vary from the mean with respect to each other?
Covariance measures between 2 dimensions:
cov(X,Y) = Σ (Xi - X̄)(Yi - Ȳ) / (n-1)
We easily see that if X = Y we end up with the variance.
Covariance Matrix
Let X be a random vector. Then the covariance matrix of X, denoted by Cov(X), is
Cov(X) = E[ (X - E[X]) (X - E[X])^t ].
The diagonals of Cov(X) are the variances Var(Xi).
In matrix notation, entry (i,j) of Cov(X) is cov(Xi, Xj).
The covariance matrix is symmetric.
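In R, reusing the two measurements from the earlier example (a small illustrative check):

x <- c(0, 8, 12, 20)
y <- c(8, 9, 11, 12)
cov(x, y)         # covariance between the two dimensions
cov(x, x)         # equals var(x): with X = Y, covariance reduces to variance
cov(cbind(x, y))  # the 2x2 covariance matrix: symmetric, variances on the diagonal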
Symmetric Matrix
Let A be a square matrix of size nxn. The matrix A is symmetric if
a_ij = a_ji for all i, j.
Orthogonality/Orthonormality
[Figure: the unit vectors v1 = (1,0) and v2 = (0,1) in the plane]
<v1, v2> = <(1 0), (0 1)> = 0
Two vectors v1 and v2 for which <v1,v2> = 0 holds are said to be orthogonal.
Unit vectors which are orthogonal are said to be orthonormal.
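A quick R check of these definitions:

v1 <- c(1, 0)
v2 <- c(0, 1)
sum(v1 * v2)     # inner product <v1,v2> = 0, so v1 and v2 are orthogonal
sqrt(sum(v1^2))  # length 1: both are unit vectors, hence the pair is orthonormal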
Eigenvalues/Eigenvectors
Let A be an nxn square matrix and x an nx1 column vector. Then a (right) eigenvector of A is a nonzero vector x such that
A x = λ x
for some scalar λ (λ is the eigenvalue, x the eigenvector).
Procedure:
Finding the eigenvalues: solve det(A - λI) = 0 to find the lambdas.
Finding corresponding eigenvectors: for each λ, solve (A - λI) x = 0.
R: eigen(matrix)
Matlab: eig(matrix)
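For example, in R (a small symmetric matrix chosen for illustration):

A <- matrix(c(2, 1,
              1, 2), nrow = 2)   # a symmetric 2x2 matrix
e <- eigen(A)
e$values   # eigenvalues: 3 and 1
e$vectors  # corresponding eigenvectors (one per column)
A %*% e$vectors[, 1] - e$values[1] * e$vectors[, 1]  # A x - lambda x, ~ zero vector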
Some Remarks
If A and B are matrices whose sizes are such that the given operations are defined, and c is any scalar, then:
(A^t)^t = A
(A + B)^t = A^t + B^t
(cA)^t = c A^t
(AB)^t = B^t A^t
Now,…
We have enough definitions to go into the procedure of how to perform principal component analysis.
Theory behind PCA: Linear algebra applied
OUTLINE
What is principal component analysis good for?
Principal Component Analysis: PCA
• The basic idea of principal component analysis
• The idea of transformation
• How to get there? The mathematics part
• Some remarks
• Basic algorithmic procedure
Idea of PCA
• Introduced by Pearson (1901) and Hotelling (1933) to describe the variation in a set of multivariate data in terms of a set of uncorrelated variables
• We typically have a data matrix of n observations on p correlated variables x1, x2, …, xp
• PCA looks for a transformation of the xi into p new variables yi that are uncorrelated
Idea
[Figure: data matrix X with genes x1…xp as rows and patients 1…n as columns]
The dimension is high.
So how can we reduce the dimension?
Simplest way: take the first one, two, or three variables; plot them and discard the rest.
Obviously a very bad idea.
Transformation
We want to find a transformation that involves ALL columns, not only the first ones.
So find a new basis, ordered such that the first component carries almost ALL the information of the whole dataset.
We are looking for a transformation of the data matrix X (pxn) such that
Y = δ^t X = δ1 X1 + δ2 X2 + … + δp Xp
Transformation
Maximize the variance of the projection of the observations on the Y variables!
Find δ such that Var(δ^t X) is maximal.
The matrix C = Var(X) is the covariance matrix of the Xi variables.
What is a reasonable choice for the δ?
Remember: we wanted a transformation that maximizes "information",
that means: captures the "variance in the data".
Transformation
Can we intuitively see that in a picture?
[Figure: two projections of the same data cloud, labelled "Good" and "Better"]
Transformation
[Figure: data cloud with orthogonal principal axes PC1 and PC2 (orthogonality)]
How do we get there?
[Figure: data matrix X with genes x1…xp as rows and patients 1…n as columns]
X is a real-valued pxn matrix.
Cov(X) is a real-valued pxp matrix or an nxn matrix:
-> decide whether you want to analyse patient groups or gene groups.
Let's decide for genes:

Cov(X) =
[ v(x1)     c(x1,x2)  …  c(x1,xp) ]
[ c(x2,x1)  v(x2)     …  c(x2,xp) ]
[ …         …         …  …        ]
[ c(xp,x1)  c(xp,x2)  …  v(xp)    ]
How do we get there?
Some Features of Cov(X)
• Cov(X) is a symmetric pxp matrix
• The diagonal terms of Cov(X) are the variances of genes across patients
• The off-diagonal terms of Cov(X) are the covariances between gene vectors
• Cov(X) captures the correlations between all possible pairs of measurements
• In the diagonal terms, by assumption, large values correspond to interesting dynamics
• In the off-diagonal terms, large values correspond to high redundancy
How do we get there?
The principal components of X are the eigenvectors of Cov(X).
Assume we can "manipulate" X a bit: let's call the result Y.
Y should be manipulated in a way that it is a bit more optimal than X was.
What does optimal mean?
That means: variances LARGE! covariances SMALL!
In other words: Cov(Y) should be diagonal, with large values on the diagonal.
How do we get there?
The manipulation is a change of basis with orthonormal vectors,
ordered such that the most important one comes first (principal)...
How do we put this in mathematical terms? Find an orthonormal P such that
Y = P X
with Cov(Y) diagonalized.
Then the rows of P are the principal components of X.
How do we get there?
Y = PX and Cov(Y) = 1/(n-1) · YY^t, so

Cov(Y) = 1/(n-1) · (PX)(PX)^t
       = 1/(n-1) · P XX^t P^t
       = 1/(n-1) · P (XX^t) P^t
       = 1/(n-1) · P A P^t,   with A := XX^t
How do we get there?
A is symmetric.
Therefore there is a matrix E of eigenvectors and a diagonal matrix D such that:
A = E D E^t
Now define P to be the transpose of the matrix E of eigenvectors:
P := E^t
Then we can write A:
A = P^t D P
How do we get there?
Now we can go back to our covariance expression:

Cov(Y) = 1/(n-1) · P A P^t
       = 1/(n-1) · P (P^t D P) P^t
       = 1/(n-1) · (P P^t) D (P P^t)
How do we get there?
The inverse of an orthogonal matrix is its transpose (due to its definition):
P^(-1) = P^t
In our context that means:

Cov(Y) = 1/(n-1) · (P P^(-1)) D (P P^(-1))
       = 1/(n-1) · D
How do we get there?
P diagonalizes Cov(Y),
where P is the transpose of the matrix of eigenvectors of XX^t.
The principal components of X are the eigenvectors of XX^t (that's the same as the rows of P).
The i-th diagonal value of Cov(Y) is the variance of X along pi (= along the i-th principal component).
Essentially we need to compute the EIGENVALUES (the explained variances) and EIGENVECTORS (the principal components) of the covariance matrix of the original matrix X.
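A small R check of this result on random toy data (5 "genes", 100 "patients"): after changing to the eigenvector basis, Cov(Y) comes out diagonal, with the eigenvalues of Cov(X) on the diagonal.

set.seed(1)
X <- matrix(rnorm(5 * 100), nrow = 5)  # 5 genes x 100 patients (toy data)
X <- X - rowMeans(X)                   # center each gene
C <- X %*% t(X) / (ncol(X) - 1)        # Cov(X)
P <- t(eigen(C)$vectors)               # rows of P = principal components
Y <- P %*% X                           # change of basis
round(Y %*% t(Y) / (ncol(X) - 1), 10)  # Cov(Y): diagonal matrix of eigenvalues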
Some Remarks
• If you multiply one variable by a scalar you get different results
• This is because PCA uses the covariance matrix (and not the correlation matrix)
• PCA should be applied to data that have approximately the same scale in each variable
• The relative variance explained by each PC is given by eigenvalue/sum(eigenvalues)
• When to stop? For example: keep enough PCs for a cumulative explained variance of >50-70%
• Kaiser criterion: keep PCs with eigenvalues >1
Some Remarks
If variables have very heterogeneous variances, we standardize them.
The standardized variables Xi*:
Xi* = (Xi - mean) / sqrt(variance)
The new variables all have the same variance, so each variable gets the same weight.
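In R, for one variable (scale() does the same for whole matrices):

xi <- c(0, 8, 12, 20)
xi.star <- (xi - mean(xi)) / sd(xi)  # standardized variable
var(xi.star)                         # 1: every standardized variable has equal weight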
REMARKS
• PCA is useful for finding new, more informative, uncorrelated features; it reduces dimensionality by rejecting low-variance features
• PCA is only powerful if the biological question is related to the highest variance in the dataset
Algorithm
Data = (Data.old - mean) / sqrt(variance)
Cov(Data) = 1/(N-1) Data * tr(Data)
Find eigenvectors/eigenvalues of Cov(Data) (function eigen in R, eig in Matlab) and sort by eigenvalue
Eigenvectors: V
Eigenvalues: D
Project the original data: P * Data, with P = tr(V)
Plot as many components as necessary
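A minimal R sketch of this algorithm, assuming a p x n matrix with variables in rows; the function name and toy data are illustrative, not from the slides:

pca.by.hand <- function(data.old) {
  # 1. standardize each variable (row): subtract mean, divide by sqrt(variance)
  data <- (data.old - rowMeans(data.old)) / apply(data.old, 1, sd)
  # 2. covariance matrix: 1/(N-1) * Data * tr(Data)
  C <- data %*% t(data) / (ncol(data) - 1)
  # 3. eigendecomposition; eigen() returns eigenvalues already sorted decreasingly
  e <- eigen(C)
  P <- t(e$vectors)                           # rows of P = principal components
  list(scores    = P %*% data,                # 4. project the original data
       explained = e$values / sum(e$values))  # relative variance per PC
}

res <- pca.by.hand(matrix(rnorm(4 * 50), nrow = 4))  # toy 4 x 50 data
plot(t(res$scores[1:2, ]))  # plot as many components as necessary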
Applications of PCA
Applications
Include:
Image processing
Microarray experiments
Pattern recognition
OUTLINE
Principal component analysis in bioinformatics
Example 1
Lefkovits et al.
[Figure: data matrix X with spots x1…xp as rows and clones 1…n as columns]
X is a real-valued pxn matrix.
They want to analyse the relatedness of clones,
so Cov(X) is a real-valued nxn matrix.
They use the correlation matrix (which is, on top of the covariance, the division by the standard deviations).
Lefkovits et al.
Example 2
Yang et al.
[Figure: PCA plots for the experimental groups babo, tkv, and control]
Ulloa-Montoya et al.
[Figure: PCA of multipotent adult progenitor cells, pluripotent embryonic stem cells, and mesenchymal stem cells]
Yang et al.
But:
We only see the different experiments.
If we do it the other way round, that means analysing the genes instead of the experiments, we see a grouping of genes.
But we never see both together.
So, can we somehow relate the experiments and the genes?
That means: group genes whose expression might be explained by the respective experimental group (tkv, babo, control)?
This leads into "correspondence analysis".
Extensions of PCA
Difficult example
Non-linear PCA
Kernel PCA
(http://research.microsoft.com/users/Cambridge/nicolasl/papers/eigen_dimred.pdf)
PCA in feature space
Side remark
Summary of kernel PCA
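The slides do not spell kernel PCA out in code; below is a hedged sketch of the standard recipe (RBF kernel, centering in feature space, eigendecomposition of the centered kernel matrix). All names are illustrative.

kernel.pca <- function(X, sigma = 1, k = 2) {
  # X: n observations x d features
  n  <- nrow(X)
  D2 <- as.matrix(dist(X))^2      # squared Euclidean distances
  K  <- exp(-D2 / (2 * sigma^2))  # RBF kernel matrix
  J  <- diag(n) - matrix(1/n, n, n)
  Kc <- J %*% K %*% J             # center the kernel matrix in feature space
  e  <- eigen(Kc, symmetric = TRUE)
  # scale eigenvectors so the projections have the conventional normalization
  alpha <- sweep(e$vectors[, 1:k, drop = FALSE], 2, sqrt(e$values[1:k]), "/")
  Kc %*% alpha                    # training points in the first k kernel PCs
}
# e.g. proj <- kernel.pca(matrix(rnorm(100 * 3), nrow = 100), sigma = 2)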
Multidimensional Scaling (MDS)
Common stress functions
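As a hedged illustration of one common stress function (Kruskal's stress-1, in one common form) together with classical MDS in R:

set.seed(2)
X    <- matrix(rnorm(20 * 5), nrow = 20)       # toy data: 20 objects, 5 variables
d    <- dist(X)                                # observed dissimilarities
conf <- cmdscale(d, k = 2)                     # classical MDS: 2-D configuration
dhat <- dist(conf)                             # distances in the configuration
stress1 <- sqrt(sum((d - dhat)^2) / sum(d^2))  # Kruskal stress-1
stress1                                        # 0 would mean a perfect fit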