Object Oriented Data Analysis, Last Time
• Gene Cell Cycle Data
• Microarrays and HDLSS visualization
• DWD bias adjustment
• NCI 60 Data
Today: More NCI 60 Data &
Detailed (mathematical) look at PCA
Last Time: Checked Data Combo, using DWD Dir’ns
DWD Views of NCI 60 Data
Interesting Question:
Which clusters are really there?
Issues:
• DWD great at finding dir’ns of separation
• And will do so even if no real structure
• Is this happening here?
• Or: which clusters are important?
• What does “important” mean?
Real Clusters in NCI 60 Data
Simple Visual Approach:
• Randomly relabel data (Cancer Types)
• Recompute DWD dir’ns & visualization
• Get heuristic impression from this
Deeper Approach
• Formal Hypothesis Testing
(Done later)
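The random relabelling idea can be sketched in a few lines. This is only a rough illustration under stated assumptions: the data are hypothetical toy points, and the distance between class means is used as a simple stand-in for a DWD direction (no DWD is computed here). The function name `mean_diff_separation` is made up for this sketch.

```python
import random

def mean_diff_separation(points, labels):
    """Distance between the two class means (a stand-in for a DWD view, not DWD itself)."""
    means = []
    for lab in (0, 1):
        group = [p for p, l in zip(points, labels) if l == lab]
        means.append([sum(col) / len(group) for col in zip(*group)])
    return sum((a - b) ** 2 for a, b in zip(*means)) ** 0.5

rng = random.Random(0)

# Hypothetical toy data: 20 points in 5 dimensions, first 10 shifted in coordinate 0.
points = []
for i in range(20):
    shift = 3.0 if i < 10 else 0.0
    points.append([rng.gauss(shift, 1.0)] + [rng.gauss(0.0, 1.0) for _ in range(4)])
labels = [0] * 10 + [1] * 10

observed = mean_diff_separation(points, labels)

# Randomly relabel, recompute the separation, and compare with the real labels.
relabelled = []
for _ in range(200):
    shuffled = labels[:]
    rng.shuffle(shuffled)
    relabelled.append(mean_diff_separation(points, shuffled))

fraction_as_large = sum(s >= observed for s in relabelled) / len(relabelled)
```

A small `fraction_as_large` gives the same heuristic impression as the relabelled plots: the real labels separate better than random ones do.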
Random Relabelling #1
Random Relabelling #2
Random Relabelling #3
Random Relabelling #4
Revisit Real Data
Revisit Real Data (Cont.)
Heuristic Results:
Strong Clust’s    Weak Clust’s    Not Clust’s
Melanoma          CNS             NSCLC
Leukemia          Ovarian         Breast
Renal             Colon
Later: will find way to quantify these ideas
i.e. develop statistical significance
NCI 60 Controversy
• Can NCI 60 Data be normalized?
• Negative Indication:
• Kuo, et al (2002) Bioinformatics, 18, 405-412.
– Based on Gene by Gene Correlations
• Resolution: Gene by Gene Data View vs. Multivariate Data View
Resolution of Paradox: Toy Data, Gene View
Resolution: Correlations suggest “no chance”
Resolution: Toy Data, PCA View
Resolution: PCA & DWD direct’ns
Resolution: DWD Adjusted
Resolution: DWD Adjusted, PCA view
Resolution: DWD Adjusted, Gene view
Resolution: Correlations & PC1 Projection Correl’n
Needed final verification of Cross-platform Normalization
• Is statistical power actually improved?
• Will study later
DWD: Why does it work?
Rob Tibshirani Query:
• Really need that complicated stuff? (DWD is complex)
• Can’t we just use means?
• Empirical Fact (Joel Parker): DWD better than simple methods
DWD: Why does it work?
Xuxin Liu Observation:
• Key is unbalanced sub-sample sizes (e.g. biological subtypes)
• Mean methods strongly affected
• DWD much more robust
• Toy Example
DWD: Why does it work?
Xuxin Liu Example
• Goals:
– Bring colors together
– Keep symbols distinct (interesting biology)
• Study varying sub-sample proportions:
– Ratio = 1: Both methods great
– Ratio = 0.61: Mean degrades, DWD good
– Ratio = 0.35: Mean poor, DWD still OK
– Ratio = 0.11: DWD degraded, still better
• Later: will find underlying theory
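Why mean methods suffer under unbalanced sub-sample sizes can be seen in a one-dimensional sketch. Everything here is a hypothetical toy (subtype locations 0 and 10, a batch offset of 3, and the helper names); it only illustrates the mean-based adjustment and does not implement DWD.

```python
def mean(values):
    return sum(values) / len(values)

def mean_adjust(batch_a, batch_b):
    """Shift batch_b so that its overall mean matches the mean of batch_a."""
    shift = mean(batch_a) - mean(batch_b)
    return [v + shift for v in batch_b]

def make_batch(n_type1, n_type2, offset):
    # Hypothetical toy: subtype 1 sits at 0, subtype 2 at 10, plus a batch offset.
    return [0.0 + offset] * n_type1 + [10.0 + offset] * n_type2

TRUE_OFFSET = 3.0
batch_a = make_batch(50, 50, 0.0)

# Balanced second batch: the mean shift removes exactly the batch offset.
balanced_b = mean_adjust(batch_a, make_batch(50, 50, TRUE_OFFSET))

# Unbalanced second batch (90% subtype 1): the mean shift mixes batch effect and biology.
unbalanced_b = mean_adjust(batch_a, make_batch(90, 10, TRUE_OFFSET))

# Residual misalignment of subtype 1 between the two batches after adjustment.
balanced_error = balanced_b[0] - batch_a[0]
unbalanced_error = unbalanced_b[0] - batch_a[0]
```

With equal proportions the subtypes line up after adjustment; with a 90/10 split the same subtype ends up 4 units apart across batches, because the overall means differ for biological as well as batch reasons.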
PCA: Rediscovery – Renaming
Statistics: Principal Component Analysis (PCA)
Social Sciences: Factor Analysis (PCA is a subset)
Probability / Electrical Eng: Karhunen-Loeve expansion
Applied Mathematics: Proper Orthogonal Decomposition (POD)
Geo-Sciences: Empirical Orthogonal Functions (EOF)
An Interesting Historical Note
The 1st (?) application of PCA to Functional
Data Analysis:
Rao, C. R. (1958) Some statistical methods
for comparison of growth curves,
Biometrics, 14, 1-17.
1st Paper with “Curves as Data” viewpoint
Detailed Look at PCA
Three important (and interesting) viewpoints:
1. Mathematics
2. Numerics
3. Statistics
1st: Review linear alg. and multivar. prob.
Review of Linear Algebra
Vector Space:
• set of “vectors”, $x$,
• and “scalars” (coefficients), $a$,
• “closed” under “linear combination”
($\sum_i a_i x_i$ in space)
e.g.
$\mathbb{R}^d = \{ x = (x_1, \ldots, x_d)^t : x_1, \ldots, x_d \in \mathbb{R} \}$,
“$d$ dim Euclid’n space”
Review of Linear Algebra (Cont.)
Subspace:
• subset that is again a vector space
• i.e. closed under linear combination
• e.g. lines through the origin
• e.g. planes through the origin
• e.g. subsp. “generated by” a set of vectors
(all linear combos of them = containing hyperplane through origin)
Review of Linear Algebra (Cont.)
Basis of subspace: set of vectors that:
• span, i.e. everything is a lin. com. of them
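• are linearly indep’t, i.e. lin. com. is unique
• e.g. “unit vector basis” of $\mathbb{R}^d$:
$(1, 0, \ldots, 0)^t,\ (0, 1, \ldots, 0)^t,\ \ldots,\ (0, 0, \ldots, 1)^t$
• since
$(x_1, x_2, \ldots, x_d)^t = x_1 (1, 0, \ldots, 0)^t + x_2 (0, 1, \ldots, 0)^t + \cdots + x_d (0, 0, \ldots, 1)^t$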
Review of Linear Algebra (Cont.)
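Basis Matrix, of subspace of $\mathbb{R}^d$
Given a basis, $v_1, \ldots, v_n$,
create matrix of columns:
$B = (v_1 \ \cdots \ v_n) = \begin{pmatrix} v_{11} & \cdots & v_{n1} \\ \vdots & \ddots & \vdots \\ v_{1d} & \cdots & v_{nd} \end{pmatrix}_{d \times n}$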
Review of Linear Algebra (Cont.)
Then “linear combo” is a matrix multiplicat’n:
$\sum_{i=1}^n a_i v_i = B a$, where $a = (a_1, \ldots, a_n)^t$
Check sizes:
$d \times 1 = (d \times n)(n \times 1)$
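A small check that a linear combination of basis vectors equals the basis matrix times the coefficient vector. The vectors and coefficients are hypothetical, and `mat_vec` is a helper written for this sketch.

```python
def mat_vec(B, a):
    """Multiply a d x n matrix (list of rows) by a length-n coefficient vector."""
    return [sum(row[i] * a[i] for i in range(len(a))) for row in B]

# Hypothetical basis vectors in R^3 (d = 3, n = 2) and coefficients.
v1 = [1.0, 0.0, 2.0]
v2 = [0.0, 1.0, -1.0]
a = [2.0, 3.0]

# Basis matrix B: the columns are v1 and v2, so B has size 3 x 2.
B = [list(row) for row in zip(v1, v2)]

direct = [a[0] * p + a[1] * q for p, q in zip(v1, v2)]  # a_1 v_1 + a_2 v_2
via_matrix = mat_vec(B, a)                              # B a, sizes (3 x 2)(2 x 1)
```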
Review of Linear Algebra (Cont.)
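Aside on matrix multiplication: (linear transformat’n)
For matrices
$A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,m} \\ \vdots & \ddots & \vdots \\ a_{k,1} & \cdots & a_{k,m} \end{pmatrix}$, $B = \begin{pmatrix} b_{1,1} & \cdots & b_{1,n} \\ \vdots & \ddots & \vdots \\ b_{m,1} & \cdots & b_{m,n} \end{pmatrix}$
Define the “matrix product”
$AB = \begin{pmatrix} \sum_{i=1}^m a_{1,i} b_{i,1} & \cdots & \sum_{i=1}^m a_{1,i} b_{i,n} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^m a_{k,i} b_{i,1} & \cdots & \sum_{i=1}^m a_{k,i} b_{i,n} \end{pmatrix}$
(“inner products” of rows of $A$ with columns of $B$)
(composition of linear transformations)
Often useful to check sizes:
$k \times n = (k \times m)(m \times n)$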
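The matrix product and its size check can be written directly from the definition. This is a minimal sketch with small hypothetical matrices stored as lists of rows; `mat_mul` is a helper name chosen here.

```python
def mat_mul(A, B):
    """Product of a k x m matrix and an m x n matrix, both given as lists of rows."""
    k, m = len(A), len(A[0])
    if len(B) != m:
        raise ValueError("inner sizes must agree: (k x m)(m x n)")
    n = len(B[0])
    # Entry (r, c) is the inner product of row r of A with column c of B.
    return [[sum(A[r][i] * B[i][c] for i in range(m)) for c in range(n)]
            for r in range(k)]

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2

AB = mat_mul(A, B)   # (2 x 3)(3 x 2) gives 2 x 2
```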
Review of Linear Algebra (Cont.)
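Matrix trace:
• For a square matrix $A = \begin{pmatrix} a_{1,1} & \cdots & a_{1,m} \\ \vdots & \ddots & \vdots \\ a_{m,1} & \cdots & a_{m,m} \end{pmatrix}_{m \times m}$
• Define $\mathrm{tr}(A) = \sum_{i=1}^m a_{i,i}$
• Trace commutes with matrix multiplication: $\mathrm{tr}(AB) = \mathrm{tr}(BA)$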
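A quick numerical check that the trace commutes with matrix multiplication, using hypothetical matrices. Note that the two products need not have the same size (here 2 x 2 and 3 x 3), yet their traces agree.

```python
def mat_mul(A, B):
    """Matrix product for matrices stored as lists of rows."""
    return [[sum(A[r][i] * B[i][c] for i in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

A = [[1, 2, 3],
     [4, 5, 6]]      # 2 x 3
B = [[1, 0],
     [0, 1],
     [2, 2]]         # 3 x 2

trace_AB = trace(mat_mul(A, B))   # trace of a 2 x 2 matrix
trace_BA = trace(mat_mul(B, A))   # trace of a 3 x 3 matrix
```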
Review of Linear Algebra (Cont.)
Dimension of subspace (a notion of “size”):
• number of elements in a basis (unique)
• $\dim(\mathbb{R}^d) = d$ (use basis above)
• e.g. dim of a line is 1
• e.g. dim of a plane is 2
• dimension is “degrees of freedom”
Review of Linear Algebra (Cont.)
Norm of a vector:
• in $\mathbb{R}^d$, $\|x\| = \left( \sum_{j=1}^d x_j^2 \right)^{1/2} = (x^t x)^{1/2}$
• Idea: “length” of the vector
• Note: strange properties for high $d$,
e.g. “length of diagonal of unit cube” = $\sqrt{d}$
Review of Linear Algebra (Cont.)
Norm of a vector (cont.):
• “length normalized vector”: $x / \|x\|$
(has length one, thus on surf. of unit sphere
& is a direction vector)
• get “distance” as: $d(x, y) = \|x - y\| = \sqrt{(x - y)^t (x - y)}$
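The norm, the length-normalized vector, and the distance can be sketched as below, including the growth of the unit cube diagonal with dimension. The helper names and the example vectors are chosen for illustration only.

```python
import math

def norm(x):
    """Euclidean length: square root of the sum of squared entries."""
    return math.sqrt(sum(v * v for v in x))

def normalize(x):
    """Rescale x to length one, giving a direction vector."""
    length = norm(x)
    return [v / length for v in x]

def distance(x, y):
    """Euclidean distance: the norm of the difference x - y."""
    return norm([a - b for a, b in zip(x, y)])

# The diagonal of the unit cube in d dimensions, (1, ..., 1), has length sqrt(d).
cube_diagonals = {d: norm([1.0] * d) for d in (1, 4, 100, 10000)}

direction = normalize([3.0, 4.0])
gap = distance([1.0, 2.0], [4.0, 6.0])
```

The cube diagonal grows without bound as the dimension increases, even though every side has length one, which is one of the "strange properties" of high dimensions.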
Review of Linear Algebra (Cont.)
Inner (dot, scalar) product:
• for vectors $x$ and $y$, $\langle x, y \rangle = \sum_{j=1}^d x_j y_j = x^t y$
• related to norm, via $\|x\| = \sqrt{\langle x, x \rangle} = \sqrt{x^t x}$
Review of Linear Algebra (Cont.)
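Inner (dot, scalar) product (cont.):
• measures “angle between $x$ and $y$” as:
$\mathrm{angle}(x, y) = \cos^{-1}\left( \frac{\langle x, y \rangle}{\|x\| \, \|y\|} \right) = \cos^{-1}\left( \frac{x^t y}{\sqrt{x^t x \cdot y^t y}} \right)$
• key to “orthogonality”, i.e. “perpendicul’ty”:
$x \perp y$ if and only if $\langle x, y \rangle = 0$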
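A short sketch of the inner product and the angle it defines, with hypothetical vectors in the plane. The cosine is clamped to [-1, 1] only to guard against rounding error before calling `math.acos`.

```python
import math

def inner(x, y):
    """Inner (dot) product: sum of coordinate-wise products."""
    return sum(a * b for a, b in zip(x, y))

def angle(x, y):
    """Angle between x and y in radians, from the normalized inner product."""
    cosine = inner(x, y) / math.sqrt(inner(x, x) * inner(y, y))
    return math.acos(max(-1.0, min(1.0, cosine)))

x = [1.0, 0.0]
y = [1.0, 1.0]
z = [0.0, 2.0]

angle_xy = angle(x, y)   # 45 degrees
angle_xz = angle(x, z)   # x and z are orthogonal: inner product 0, angle 90 degrees
```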
Review of Linear Algebra (Cont.)
Orthonormal basis $v_1, \ldots, v_n$:
• All ortho to each other,
i.e. $\langle v_i, v_{i'} \rangle = 0$, for $i \neq i'$
• All have length 1,
i.e. $\langle v_i, v_i \rangle = 1$, for $i = 1, \ldots, n$
Review of Linear Algebra (Cont.)
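Orthonormal basis $v_1, \ldots, v_n$ (cont.):
• “Spectral Representation”: $x = \sum_{i=1}^n a_i v_i$
where $a_i = \langle x, v_i \rangle$
check: $\langle x, v_{i'} \rangle = \left\langle \sum_{i=1}^n a_i v_i, v_{i'} \right\rangle = \sum_{i=1}^n a_i \langle v_i, v_{i'} \rangle = a_{i'}$
• Matrix notation: $x = B a$ where $a^t = x^t B$, i.e. $a = B^t x$
$a$ is called “transform (e.g. Fourier, wavelet) of $x$”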
Review of Linear Algebra (Cont.)
Parseval identity, for $x$
in subsp. gen’d by o. n. basis $v_1, \ldots, v_n$:
$\|x\|^2 = \sum_{i=1}^n \langle x, v_i \rangle^2 = \sum_{i=1}^n a_i^2 = \|a\|^2$
• Pythagorean theorem
• “Decomposition of Energy”
• ANOVA - sums of squares
• Transform, $a$, has same length as $x$,
i.e. “rotation in $\mathbb{R}^d$”
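The spectral representation and the Parseval identity can be checked numerically. This sketch assumes a hypothetical orthonormal basis of the plane (the two diagonal directions) and an arbitrary vector.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

# Hypothetical orthonormal basis of R^2: the two diagonal directions.
s = 1.0 / math.sqrt(2.0)
basis = [[s, s], [s, -s]]
x = [3.0, 1.0]

# Transform: coefficients a_i = <x, v_i>, i.e. a = B^t x.
coeffs = [inner(x, v) for v in basis]

# Reconstruction: x = sum_i a_i v_i, i.e. x = B a.
recon = [sum(a * v[j] for a, v in zip(coeffs, basis)) for j in range(len(x))]

# Parseval: the squared length of x equals the squared length of its transform.
energy_x = inner(x, x)
energy_a = inner(coeffs, coeffs)
```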
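Review of Linear Algebra (Cont.)
Gram-Schmidt Ortho-normalization
Idea: Given a basis $v_1, \ldots, v_n$,
find an orthonormal version,
by subtracting non-ortho part
$u_1 = v_1 / \|v_1\|$
$u_2 = (v_2 - \langle v_2, u_1 \rangle u_1) / \|v_2 - \langle v_2, u_1 \rangle u_1\|$
$u_3 = (v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2) / \|v_3 - \langle v_3, u_1 \rangle u_1 - \langle v_3, u_2 \rangle u_2\|$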
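Gram-Schmidt can be sketched as a loop: each new vector has its components along all earlier orthonormal vectors subtracted, then is rescaled to length one. This is the classical form of the procedure, written for clarity rather than numerical robustness; the input basis is hypothetical.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same subspace as the input basis."""
    ortho = []
    for v in vectors:
        w = list(v)
        # Subtract the part of v lying along each earlier orthonormal vector.
        for u in ortho:
            coeff = inner(v, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        length = math.sqrt(inner(w, w))
        if length < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        ortho.append([wi / length for wi in w])
    return ortho

# Hypothetical (non-orthogonal) basis of R^3.
basis = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
u = gram_schmidt(basis)
```

For ill-conditioned inputs, the "modified" variant (taking coefficients from the partially reduced `w` instead of `v`) is usually preferred.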
Review of Linear Algebra (Cont.)
Projection of a vector $x$ onto a subspace $V$:
• Idea: member of $V$ that is closest to $x$
(i.e. “approx’n”)
• Find $P_V x \in V$ that solves: $\min_{v \in V} \|x - v\|$
(“least squares”)
• For inner product (Hilbert) space:
$P_V x$ exists and is unique
Review of Linear Algebra (Cont.)
Projection of a vector onto a subspace (cont.):
• General solution in $\mathbb{R}^d$: for basis matrix $B_V$,
$P_V x = B_V (B_V^t B_V)^{-1} B_V^t x$
• So “proj’n operator” is “matrix mult’n”:
$P_V = B_V (B_V^t B_V)^{-1} B_V^t$
(thus projection is another linear operation)
(note same operation underlies least squares)
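The general projection formula can be sketched for a two-dimensional subspace, where the inverse of the 2 x 2 matrix $B_V^t B_V$ has a closed form. The basis below is hypothetical and deliberately not orthogonal; `project` is a helper written for this sketch and handles only two basis vectors.

```python
def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

def project(basis, x):
    """Project x onto the span of two basis vectors via B (B^t B)^{-1} B^t x."""
    v1, v2 = basis
    # Entries of the 2 x 2 matrix B^t B and of the vector B^t x.
    g11, g12, g22 = inner(v1, v1), inner(v1, v2), inner(v2, v2)
    b1, b2 = inner(v1, x), inner(v2, x)
    det = g11 * g22 - g12 * g12
    # Coefficients c = (B^t B)^{-1} B^t x, using the closed-form 2 x 2 inverse.
    c1 = (g22 * b1 - g12 * b2) / det
    c2 = (g11 * b2 - g12 * b1) / det
    return [c1 * p + c2 * q for p, q in zip(v1, v2)]

# Hypothetical non-orthogonal basis of the plane {z = 0} in R^3.
basis = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
x = [2.0, 3.0, 4.0]

proj = project(basis, x)
residual = [xi - pi for xi, pi in zip(x, proj)]
```

The residual is orthogonal to every basis vector, which is the least squares property of the projection.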
Review of Linear Algebra (Cont.)
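Projection using orthonormal basis $v_1, \ldots, v_n$:
• Basis matrix is “orthonormal”:
$B_V^t B_V = \begin{pmatrix} v_1^t \\ \vdots \\ v_n^t \end{pmatrix} (v_1 \ \cdots \ v_n) = \begin{pmatrix} \langle v_1, v_1 \rangle & \cdots & \langle v_1, v_n \rangle \\ \vdots & \ddots & \vdots \\ \langle v_n, v_1 \rangle & \cdots & \langle v_n, v_n \rangle \end{pmatrix} = \begin{pmatrix} 1 & & 0 \\ & \ddots & \\ 0 & & 1 \end{pmatrix} = I_{n \times n}$
• So $P_V x = B_V B_V^t x$ =
= Recon(Coeffs of $x$ “in dir’n” $V$)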
Review of Linear Algebra (Cont.)
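Projection using orthonormal basis (cont.):
• For “orthogonal complement”, $V^\perp$,
$x = P_V x + P_{V^\perp} x$ and $\|x\|^2 = \|P_V x\|^2 + \|P_{V^\perp} x\|^2$
• Parseval inequality:
$\|x\|^2 \ge \|P_V x\|^2 = \sum_{i=1}^n \langle x, v_i \rangle^2 = \sum_{i=1}^n a_i^2 = \|a\|^2$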
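With an orthonormal basis the projection reduces to "take coefficients, then reconstruct", and the squared length splits between the subspace and its orthogonal complement. The basis and vector below are hypothetical.

```python
import math

def inner(x, y):
    return sum(a * b for a, b in zip(x, y))

# Hypothetical orthonormal basis of a 2-d subspace V of R^3.
s = 1.0 / math.sqrt(2.0)
basis_V = [[1.0, 0.0, 0.0], [0.0, s, s]]
x = [1.0, 2.0, 4.0]

coeffs = [inner(x, v) for v in basis_V]                      # a = B_V^t x
proj = [sum(a * v[j] for a, v in zip(coeffs, basis_V))       # P_V x = B_V a
        for j in range(len(x))]
resid = [xi - pi for xi, pi in zip(x, proj)]                 # part in the complement

energy_x = inner(x, x)
energy_proj = inner(proj, proj)
energy_resid = inner(resid, resid)
energy_coeffs = inner(coeffs, coeffs)
```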
Review of Linear Algebra (Cont.)
(Real) Unitary Matrices: $U_{d \times d}$ with $U^t U = I$
• Orthonormal basis matrix
(so all of above applies)
• Follows that $U U^t = I$
(since have full rank, so $U^{-1}$ exists …)
• Lin. trans. (mult. by $U$) is like “rotation” of $\mathbb{R}^d$
• But also includes “mirror images”
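Both a rotation and a mirror image satisfy $U^t U = I$; they differ in the sign of the determinant. A minimal 2 x 2 sketch, with an arbitrary angle and helper names chosen for illustration:

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    return [[sum(A[r][i] * B[i][c] for i in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def is_orthonormal(U, tol=1e-12):
    """Check U^t U = I entry by entry."""
    P = mat_mul(transpose(U), U)
    return all(abs(P[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(P)) for j in range(len(P)))

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

theta = 0.3  # arbitrary rotation angle in radians
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
reflection = [[1.0, 0.0],
              [0.0, -1.0]]   # mirror image across the first axis
```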
Review of Linear Algebra (Cont.)
Singular Value Decomposition (SVD):
For a matrix $X_{d \times n}$
Find a diagonal matrix $S_{d \times n}$,
with entries $s_1, \ldots, s_{\min(d,n)}$
called singular values
And unitary (rotation) matrices $U_{d \times d}$, $V_{n \times n}$
(recall $U^t U = V^t V = I$)
so that $X = U S V^t$
Review of Linear Algebra (Cont.)
Intuition behind Singular Value Decomposition:
• For $X$ a “linear transf’n” (via matrix multi’n)
$X v = U S V^t v = U (S (V^t v))$
• First rotate
• Second rescale coordinate axes (by $s_i$)
• Third rotate again
• i.e. have diagonalized the transformation
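The rotate, rescale, rotate reading can be traced step by step. Note the assumption: this sketch does not compute an SVD; it builds a 2 x 2 matrix from chosen factors (two hypothetical rotations and two hypothetical singular values) and then checks that applying the three steps in turn agrees with multiplying by the assembled matrix.

```python
import math

def rot(theta):
    """2 x 2 rotation matrix by angle theta (radians)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def transpose(M):
    return [list(col) for col in zip(*M)]

def mat_mul(A, B):
    return [[sum(A[r][i] * B[i][c] for i in range(len(B))) for c in range(len(B[0]))]
            for r in range(len(A))]

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Hypothetical factors: rotations U and V, singular values 3 and 0.5.
U, V = rot(0.5), rot(-1.2)
S = [[3.0, 0.0], [0.0, 0.5]]
X = mat_mul(mat_mul(U, S), transpose(V))   # X = U S V^t

v = [1.0, 2.0]
step1 = mat_vec(transpose(V), v)   # first rotate
step2 = mat_vec(S, step1)          # second rescale each coordinate axis by s_i
step3 = mat_vec(U, step2)          # third rotate again
direct = mat_vec(X, v)             # same result in one multiplication
```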
Review of Linear Algebra (Cont.)
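SVD Compact Representation:
Useful Labeling:
Singular Values in Decreasing Order $s_1 \ge \cdots \ge s_{\min(n,d)}$
Note: singular values = 0 can be omitted
Let $r$ = # of positive singular values
Then: $X = U_{d \times r} S_{r \times r} V_{n \times r}^t$
Where $U_{d \times r}, S_{r \times r}, V_{n \times r}$ are truncations of $U, S, V$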
Review of Linear Algebra (Cont.)
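Eigenvalue Decomposition:
For a (symmetric) square matrix $X_{d \times d}$
Find a diagonal matrix $D = \begin{pmatrix} \lambda_1 & & 0 \\ & \ddots & \\ 0 & & \lambda_d \end{pmatrix}$
And an orthonormal matrix $B_{d \times d}$
(i.e. $B^t B = B B^t = I_{d \times d}$)
So that: $X B = B D$, i.e. $X = B D B^t$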
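For a symmetric 2 x 2 matrix the eigenvalue decomposition has a closed form, which makes $X = B D B^t$ easy to verify by hand. The function `eig_sym_2x2` is a helper written for this sketch (not a library routine) and the matrix is a hypothetical example.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigen-decomposition of [[a, b], [b, c]].

    Returns (eigenvalues, B), where the columns of B are unit eigenvectors.
    """
    if b == 0:
        return [a, c], [[1.0, 0.0], [0.0, 1.0]]
    mid = (a + c) / 2.0
    half = math.hypot((a - c) / 2.0, b)
    lams = [mid + half, mid - half]
    cols = []
    for lam in lams:
        # (b, lam - a) solves (X - lam I) v = 0 when b is nonzero.
        length = math.hypot(b, lam - a)
        cols.append([b / length, (lam - a) / length])
    B = [[cols[0][0], cols[1][0]],
         [cols[0][1], cols[1][1]]]
    return lams, B

X = [[2.0, 1.0],
     [1.0, 2.0]]
lams, B = eig_sym_2x2(X[0][0], X[0][1], X[1][1])

# Reconstruct X = B D B^t entry by entry.
recon = [[sum(B[i][k] * lams[k] * B[j][k] for k in range(2)) for j in range(2)]
         for i in range(2)]
```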
Review of Linear Algebra (Cont.)
Eigenvalue Decomposition (cont.):
• Relation to Singular Value Decomposition (looks similar?):
• Eigenvalue decomposition “harder”
• Since needs $U = V$
• Price is eigenvalue decomp’n is generally complex
• Except for $X$ square and symmetric
• Then eigenvalue decomp. is real valued
• Thus is the sing’r value decomp. with: $U = V = B$
(when the eigenvalues are non-negative, e.g. for covariance matrices)