PCA Regularized PCA Results Discussion
Regularized PCA to denoise and visualise data
Marie Verbanck · Julie Josse · François Husson
Laboratoire de statistique, Agrocampus Ouest, Rennes, France
Statlearn, Bordeaux, 9 April 2013
Outline
1 PCA
2 Regularized PCA
MSE point of view
Bayesian point of view
3 Results
4 Discussion
Geometrical point of view
Maximize the variance of the projected points = minimize the reconstruction error
⇒ approximation of X of low rank (S < p): minimize ‖X_{n×p} − X̂_{n×p}‖²
SVD: X̂^PCA = U_{n×S} Λ^{1/2}_{S×S} V′_{p×S} = F_{n×S} V′_{p×S}
F = UΛ^{1/2}: principal components; V: principal axes
Alternating least squares (ALS) algorithm on ‖X − FV′‖²:
F = XV(V′V)⁻¹
V = X′F(F′F)⁻¹
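The equivalence of the two views (truncated SVD and the ALS updates) can be checked in a few lines of numpy; the data, rank S and variable names below are illustrative, not from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 8))
X = X - X.mean(axis=0)                   # centred data, as PCA assumes

S = 2                                    # number of retained dimensions
U, sval, Vt = np.linalg.svd(X, full_matrices=False)
F = U[:, :S] * sval[:S]                  # principal components F = U Lambda^(1/2)
V = Vt[:S].T                             # principal axes
X_pca = F @ V.T                          # best rank-S approximation of X

# One ALS sweep leaves the SVD solution unchanged (it is a fixed point):
F_new = X @ V @ np.linalg.inv(V.T @ V)
V_new = X.T @ F_new @ np.linalg.inv(F_new.T @ F_new)
assert np.allclose(F_new @ V_new.T, X_pca)
```

The ALS view matters below because regularized variants replace these two plain regressions by ridge regressions.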
Model in PCA
x_ij = signal + noise
• Random effect model: Factor Analysis, Probabilistic PCA (Lawley, 1953; Roweis, 1998; Tipping & Bishop, 1999)
⇒ individuals are exchangeable
⇒ models the relationships between variables
• Fixed effect model (Lawley, 1941; Caussinus, 1986)
⇒ individuals are not i.i.d.
⇒ studies both the individuals and the variables
Examples: sensory analysis (products × sensory descriptors), plant breeding (genotypes × environments)
Fixed effect model
X_{n×p} = X̃_{n×p} + ε_{n×p}
x_ij = Σ_{s=1}^S √d_s q_is r_js + ε_ij,  ε_ij ~ N(0, σ²)
• Randomness is due to the noise only
• Individuals have different expectations: individual parameters
• Inferential framework: when n ↗, the number of parameters ↗ ⇒ asymptotics taken as σ → 0
• Maximum likelihood estimates (θ̂) = least squares estimates: X̂^PCA = UΛ^{1/2}V′, and the EM algorithm = the ALS algorithm
⇒ Not necessarily the best estimates in terms of MSE
⇒ Shrinkage estimates: minimize E(φθ̂ − θ)²
Minimizing the MSE
x_ij = x̃_ij + ε_ij = Σ_{s=1}^S x̃_ij^(s) + ε_ij
MSE = E[ Σ_{i,j} ( Σ_{s=1}^{min(n−1,p)} φ_s x̂_ij^(s) − x̃_ij )² ]  with x̂_ij^(s) = √λ_s u_is v_js
MSE = E[ Σ_{i,j} { Σ_{s=1}^S ( φ_s x̂_ij^(s) − x̃_ij^(s) )² + Σ_{s=S+1}^{min(n−1,p)} ( φ_s x̂_ij^(s) − x̃_ij^(s) )² } ]
For all s ≥ S+1, x̃_ij^(s) = 0, therefore φ_{S+1} = ... = φ_{min(n−1,p)} = 0.
Minimizing the MSE
MSE = E[ Σ_{i,j} ( Σ_{s=1}^S φ_s x̂_ij^(s) − x̃_ij )² ]  with x̂_ij^(s) = √λ_s u_is v_js
φ_s = Σ_{i,j} E(x̂_ij^(s)) x̃_ij / Σ_{i,j} E(x̂_ij^(s)²)
    = Σ_{i,j} E(x̂_ij^(s)) x̃_ij / Σ_{i,j} [ V(x̂_ij^(s)) + (E(x̂_ij^(s)))² ]
When σ → 0: E(x̂_ij^(s)) = x̃_ij^(s) (unbiased); V(x̂_ij^(s)) = σ²/min(n−1,p)  (Pazman & Denis, 1996; Denis & Gower, 1999)
φ_s = Σ_{i,j} x̃_ij^(s)² / ( Σ_{i,j} σ²/min(n−1,p) + Σ_{i,j} x̃_ij^(s)² )
Minimizing the MSE
x_ij = x̃_ij + ε_ij = Σ_{s=1}^S √d_s q_is r_js + ε_ij,  ε_ij ~ N(0, σ²)
MSE = E[ Σ_{i,j} ( Σ_{s=1}^S φ_s x̂_ij^(s) − x̃_ij )² ]  with x̂_ij^(s) = √λ_s u_is v_js
φ_s = Σ_{i,j} x̃_ij^(s)² / ( Σ_{i,j} σ²/min(n−1,p) + Σ_{i,j} x̃_ij^(s)² )
    = d_s / ( npσ²/min(n−1,p) + d_s ) = signal variance / total variance
φ_s = (λ_s − σ²)/λ_s   for s = 1, ..., S
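Plugging made-up numbers into the signal-over-total form of φ_s shows how much a dimension gets shrunk (all values below are hypothetical, chosen only for round arithmetic):

```python
# phi_s = d_s / (d_s + n p sigma^2 / min(n-1, p)): signal variance over total variance
n, p, sigma2 = 100, 20, 1.0
d_s = 400.0                                    # hypothetical signal variance on dimension s
noise_part = n * p * sigma2 / min(n - 1, p)    # noise share of the dimension: 100.0
phi_s = d_s / (d_s + noise_part)               # 0.8: the dimension is shrunk by 20%
assert phi_s == 0.8
```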
Regularized PCA
x̂_ij^PCA = Σ_{s=1}^S √λ_s u_is v_js
x̂_ij^rPCA = Σ_{s=1}^S ((λ_s − σ²)/λ_s) √λ_s u_is v_js = Σ_{s=1}^S (√λ_s − σ²/√λ_s) u_is v_js
Compromise between hard and soft thresholding:
σ² small → regularized PCA ≈ PCA
σ² large → individuals shrunk toward the centre of gravity
σ̂² = RSS/ddl = n Σ_{s=S+1}^{min(n−1,p)} λ_s / (np − p − nS − pS + S² + S)
(ddl = number of cells of X_{n×p} minus the number of parameters in the means, U_{n×S} and V_{p×S}; only asymptotically unbiased)
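A numpy sketch of the resulting estimator on simulated data. The eigenvalue scaling (λ_s taken as sval²/n) and the use of the known σ² instead of its estimate are simplifying assumptions of this toy example, not the talk's exact conventions:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, S, sigma = 100, 20, 2, 2.0
truth = rng.standard_normal((n, S)) @ rng.standard_normal((S, p))   # low-rank signal
X = truth + sigma * rng.standard_normal((n, p))                     # noisy observation

U, sval, Vt = np.linalg.svd(X, full_matrices=False)
lam = sval[:S] ** 2 / n                  # eigenvalues on a per-observation scale (assumed)
phi = (lam - sigma**2) / lam             # shrinkage (lambda_s - sigma^2) / lambda_s

X_pca = (U[:, :S] * sval[:S]) @ Vt[:S]             # hard-thresholded SVD
X_rpca = (U[:, :S] * (phi * sval[:S])) @ Vt[:S]    # regularized PCA: shrunken SVD

mse_pca = np.mean((X_pca - truth) ** 2)
mse_rpca = np.mean((X_rpca - truth) ** 2)   # typically below mse_pca at this noise level
```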
Illustration of the regularization
X^sim_{4×10} = X̃_{4×10} + ε^sim_{4×10},  sim = 1, ..., 1000
[Figure: PCA vs rPCA configurations over the 1000 simulations]
⇒ rPCA is more biased but less variable
Bayesian point of view: ridge regression reminder
Y = Xβ + ε,  ε ~ N(0, σ²)
β ~ N(0, τ²)
β̂^MAP = (X′X + κI)⁻¹ X′Y,  with κ = σ²/τ²
Probabilistic PCA (Roweis, 1998; Tipping & Bishop, 1999)
⇒ A specific case of an exploratory factor analysis model
x_ij = (Z_{n×S} B′_{p×S})_ij + ε_ij,  z_i ~ N(0, I_S),  ε_i ~ N(0, σ²I_p)
• Conditional independence: x_i | z_i ~ N(Bz_i, σ²I_p)
• Distribution of the observations: x_i ~ N(0, Σ) with Σ = BB′ + σ²I_p
• Explicit solution:
  σ̂² = 1/(p−S) Σ_{s=S+1}^p λ_s
  B̂ = V(Λ − σ²I_S)^{1/2}
Probabilistic PCA
x_ij = (Z_{n×S}B′_{p×S})_ij + ε_ij,  z_i ~ N(0, I_S),  ε_i ~ N(0, σ²I_p)
⇒ Random effect model: no individual parameters
⇒ BLUP E(z_i|x_i):  Ẑ = XB̂(B̂′B̂ + σ²I_S)⁻¹
X̂^pPCA = ẐB̂′
Using B̂ = V(Λ − σ²I_S)^{1/2} and the SVD X = UΛ^{1/2}V′ (so XV = UΛ^{1/2}):
X̂^pPCA = U(Λ − σ²I_S)Λ^{−1/2}V′,  i.e.  x̂_ij^pPCA = Σ_{s=1}^S (√λ_s − σ²/√λ_s) u_is v_js
Bayesian treatment of the fixed effect model with a prior on z_i ⇒ Regularized PCA
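The matrix form and the shrunken-singular-value form are the same estimator; a quick numeric check on arbitrary data (the rank and σ² below are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((15, 6))
S, sigma2 = 2, 0.3                       # illustrative rank and noise variance

U, sval, Vt = np.linalg.svd(X, full_matrices=False)
lam = sval[:S] ** 2

# pPCA fitted values U (Lambda - sigma^2 I) Lambda^(-1/2) V' ...
X_ppca = (U[:, :S] * ((lam - sigma2) / np.sqrt(lam))) @ Vt[:S]

# ... coincide with the shrunken singular values (sqrt(lambda_s) - sigma^2/sqrt(lambda_s))
X_shrunk = (U[:, :S] * (np.sqrt(lam) - sigma2 / np.sqrt(lam))) @ Vt[:S]
assert np.allclose(X_ppca, X_shrunk)
```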
Probabilistic PCA
⇒ EM algorithm in pPCA: two ridge regressions
E step: Ẑ = XB̂(B̂′B̂ + σ²I_S)⁻¹
M step: B̂ = X′Ẑ(Ẑ′Ẑ + σ²Λ⁻¹)⁻¹
⇒ EM algorithm in PCA: two ordinary regressions on ‖X − FV′‖²
F = XV(V′V)⁻¹
V = X′F(F′F)⁻¹
An empirical Bayesian approach
Model: x_ij = Σ_{s=1}^S x̃_ij^(s) + ε_ij,  ε_ij ~ N(0, σ²)
Prior: x̃_ij^(s) ~ N(0, τ_s²)
Posterior: E(x̃_ij^(s) | x̂_ij^(s)) = Φ_s x̂_ij^(s)  with  Φ_s = τ_s² / ( τ_s² + σ²/min(n−1,p) )
Maximizing the likelihood of x̂_ij^(s) ~ N(0, τ_s² + σ²/min(n−1,p)) as a function of τ_s² gives
τ̂_s² = λ_s/np − σ²/min(n−1,p)
Φ_s = ( λ_s − (np/min(n−1,p)) σ² ) / λ_s
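The two expressions for Φ_s (posterior shrinkage in terms of τ_s², and the closed form in terms of λ_s) agree once τ̂_s² is plugged in; a check with made-up values of n, p, σ² and λ_s:

```python
# Both forms of the empirical-Bayes shrinkage Phi_s, with hypothetical values
n, p, sigma2 = 50, 10, 2.0
m = min(n - 1, p)                        # = 10
lam = 500.0                              # hypothetical eigenvalue lambda_s

tau2 = lam / (n * p) - sigma2 / m        # ML estimate of the prior variance tau_s^2
phi = tau2 / (tau2 + sigma2 / m)         # posterior shrinkage Phi_s
assert abs(phi - (lam - n * p * sigma2 / m) / lam) < 1e-12
```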
Other Bayesian points of view
X = X̃ + ε = UDV′ + ε
⇒ Bayesian SVD models: overparametrization issue
• Hoff (2009): uniform priors for U and V, special cases of von Mises–Fisher distributions (Stiefel manifold); (d_s)_{s=1...S} ~ N(0, s_λ²)
• Josse & Denis (2013): disregarding the orthonormality constraints
  u_is ~ N(0, 1),  v_js ~ N(0, 1),  (d_s)_{s=1...S} ~ N(0, s_λ²)  ⇒  E(x_ij) = 0,  V(x_ij) = S s_λ²
Simulation design
The simulated data: X = X̃ + ε
Several parameters are varied to construct X̃:
• n/p: 100/20 = 5, 50/50 = 1 and 20/100 = 0.2
• number of underlying dimensions S: 2, 4, 10
• the ratio d1/d2: 4, 1
ε_ij is drawn from a Gaussian distribution with a standard deviation chosen to reach a signal-to-noise ratio (SNR) of 4, 1 or 0.8
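One draw from such a design can be sketched as follows; the exact construction (orthonormal factors, unit singular values except d1) is an assumption of this sketch, and SNR is taken as the ratio of signal to noise standard deviations:

```python
import numpy as np

def simulate(n, p, S, snr, ratio, rng):
    """Rank-S signal with d1/d2 = ratio, plus Gaussian noise scaled to the target SNR."""
    d = np.ones(S)
    d[0] = ratio                                       # singular-value profile, d1/d2 = ratio
    U, _ = np.linalg.qr(rng.standard_normal((n, S)))   # random orthonormal left vectors
    V, _ = np.linalg.qr(rng.standard_normal((p, S)))   # random orthonormal right vectors
    signal = (U * d) @ V.T
    sigma = signal.std() / snr                         # noise sd chosen to hit the SNR
    return signal + sigma * rng.standard_normal((n, p)), signal

rng = np.random.default_rng(6)
X, truth = simulate(n=100, p=20, S=4, snr=1.0, ratio=4, rng=rng)
```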
Soft thresholding and SURE (Candès et al., 2012)
x̂_ij^Soft = Σ_{s=1}^{min(n,p)} (√λ_s − λ)₊ u_is v_js   minimizes ‖X − X̂‖² + λ‖X̂‖₊*
How to choose the tuning parameter λ?  MSE(λ) = E‖X̃ − SVT_λ(X)‖²
⇒ Stein's Unbiased Risk Estimate (SURE):
SURE(SVT_λ)(X) = −npσ² + Σ_{i=1}^{min(n,p)} min(λ², λ_i) + 2σ² div(SVT_λ)(X)
⇒ λ is obtained by minimizing SURE. Remark: σ² is required.
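Singular value soft thresholding itself is a one-liner on top of the SVD; a minimal sketch (the data and the threshold value are illustrative):

```python
import numpy as np

def svt(X, thresh):
    """Singular value soft thresholding: replace each singular value s by (s - thresh)+."""
    U, sval, Vt = np.linalg.svd(X, full_matrices=False)
    return (U * np.maximum(sval - thresh, 0.0)) @ Vt

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 10))
X_soft = svt(X, thresh=2.0)

# soft thresholding lowers both the rank and the nuclear norm
assert np.linalg.matrix_rank(X_soft) <= np.linalg.matrix_rank(X)
assert np.linalg.svd(X_soft, compute_uv=False).sum() < np.linalg.svd(X, compute_uv=False).sum()
```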
Simulation study

n/p     S  SNR  d1/d2  MSE(X̂^PCA, X̃)  MSE(X̂^rPCA, X̃)  MSE(X̂^SURE, X̃)
100/20  4  4    4      8.25E-04        8.24E-04         1.42E-03
100/20  4  4    1      8.26E-04        8.25E-04         1.43E-03
100/20  4  1    4      2.60E-01        1.96E-01         2.43E-01
100/20  4  1    1      2.47E-01        2.04E-01         2.62E-01
100/20  4  0.8  4      7.41E-01        4.27E-01         4.36E-01
100/20  4  0.8  1      6.68E-01        4.40E-01         4.83E-01
50/50   4  4    4      5.48E-04        5.48E-04         1.04E-03
50/50   4  4    1      5.46E-04        5.46E-04         1.04E-03
50/50   4  1    4      1.75E-01        1.53E-01         2.00E-01
50/50   4  1    1      1.68E-01        1.52E-01         2.09E-01
50/50   4  0.8  4      5.07E-01        3.87E-01         3.85E-01
50/50   4  0.8  1      4.67E-01        3.76E-01         4.13E-01
20/100  4  4    4      8.28E-04        8.27E-04         1.41E-03
20/100  4  4    1      8.29E-04        8.28E-04         1.42E-03
20/100  4  1    4      2.55E-01        1.97E-01         2.45E-01
20/100  4  1    1      2.48E-01        2.04E-01         2.60E-01
20/100  4  0.8  4      7.13E-01        4.15E-01         4.37E-01
20/100  4  0.8  1      6.66E-01        4.34E-01         4.78E-01

MSE in favor of rPCA
rPCA ≈ PCA when the SNR is high; rPCA clearly improves on PCA when the SNR is small
The gain of rPCA over PCA is larger when d1 > d2
Errors are of the same order for n/p = 0.2 and n/p = 5, and smaller for n/p = 1!
SURE is good when the data are noisy and S is high
⇒ the SNR alone is not a good measure of the level of noise
SURE can be bad: the soft threshold (√λ_s − λ)₊ selects too many components
Candès et al. (2012)'s simulation
• Same simulation model
• n = 200, p = 500; SNR ∈ {4, 2, 1, 0.5}; S ∈ {10, 100}

S    SNR  MSE(X̂^PCA, X̃)  MSE(X̂^rPCA, X̃)  MSE(X̂^SURE, X̃)
10   4    4.31E-03        4.29E-03         8.74E-03
10   2    1.74E-02        1.71E-02         3.29E-02
10   1    7.16E-02        6.75E-02         1.16E-01
10   0.5  3.19E-01        2.57E-01         3.53E-01
100  4    3.79E-02        3.69E-02         4.50E-02
100  2    1.58E-01        1.41E-01         1.56E-01
100  1    7.29E-01        4.91E-01         4.48E-01
100  0.5  3.16E+00        1.48E+00         8.52E-01

⇒ Good behavior of rPCA
⇒ SURE is good when the data are noisy and S is high
Note: the MSE of rPCA with 100 dimensions is 1.48, whereas the data are so noisy that the MSE with 0 dimensions is 1
Individuals representation
100 individuals, 20 variables, 2 dimensions, SNR = 0.8, d1/d2 = 4
[Figure: true configuration vs PCA]
[Figure: PCA vs rPCA]
[Figure: PCA vs SURE]
[Figure: true configuration vs rPCA]
⇒ rPCA improves the recovery of the inner-product matrix X̃X̃′
Variables representation
[Figure: true configuration vs PCA]
cor(X7, X9) = 1 (true) vs 0.68 (PCA)
[Figure: PCA vs rPCA]   cor(X7, X9): 0.68 (PCA) vs 0.81 (rPCA)
[Figure: PCA vs SURE]   cor(X7, X9): 0.68 (PCA) vs 0.82 (SURE)
[Figure: true configuration vs rPCA]   cor(X7, X9): 1 (true) vs 0.81 (rPCA)
⇒ rPCA improves the recovery of the covariance matrix X̃′X̃
Real data set
• 27 chickens submitted to 4 nutritional statuses
• 12664 gene expressions
[Figure: PCA vs rPCA]
[Figure: SURE vs rPCA]
[Figure: sparse PCA (Witten, 2009) vs rPCA]
[Figures: bi-clusterings of the PCA, SURE and sparse PCA reconstructions vs rPCA]
Bi-clustering on the rPCA reconstruction recovers the 4 nutritional statuses better (e.g. F16R16 clusters with F16R5 and with N in rPCA)!
Conclusion
⇒ Recover a low-rank matrix from noisy observations
• singular value thresholding: a popular estimation strategy
• a regularization term derived a priori, (√λ_s − σ²/√λ_s): minimizes the MSE (asymptotic results) and is a Bayesian treatment of the fixed effect model
⇒ Good results: recovery of the structure, of the distances between individuals, of the covariance matrix; visualization
⇒ Tuning parameter: the number of dimensions!
• too few: information is lost (noise variance overestimated)
• too many: all the relevant information is kept but the noise variance is underestimated; is that better? Not always!
Noisier settings → fewer dimensions, to regularize more
Selecting the number of dimensions (Josse & Husson, 2011)
[Figure: scree plot — where to cut?]
⇒ EM-CV (Bro et al., 2008): MSEP_S = (1/np) Σ_{i=1}^n Σ_{j=1}^p (x_ij − x̂_ij^{−ij})²
⇒ Computationally costly
In regression, ŷ = Py (Craven & Wahba, 1979):  y_i − ŷ_i^{−i} = (y_i − ŷ_i)/(1 − P_ii)
In PCA, vec(X̂^(S)) = P^(S) vec(X) with X̂ = FV′ and
P^(S)_{np×np} = (P_V ⊗ I_n) + (I_p ⊗ P_F) − (P_V ⊗ P_F)
ACV_S = (1/np) Σ_{i,j} ( (x_ij − x̂_ij)/(1 − P_{ij,ij}) )²
GCV_S = (1/np) Σ_{i,j} (x_ij − x̂_ij)² / (1 − tr(P^(S))/np)²
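Since tr(P_V ⊗ I_n) = nS, tr(I_p ⊗ P_F) = pS and tr(P_V ⊗ P_F) = S², GCV needs only the singular values; a sketch of the selection rule on simulated data (not the authors' code):

```python
import numpy as np

def gcv_scores(X, max_dim):
    """GCV_S for S = 1..max_dim, using tr(P^(S)) = nS + pS - S^2."""
    n, p = X.shape
    sval2 = np.linalg.svd(X, compute_uv=False) ** 2
    scores = {}
    for S in range(1, max_dim + 1):
        rss = sval2[S:].sum()                    # residual sum of squares of rank-S PCA
        trace = n * S + p * S - S ** 2           # trace of the hat matrix P^(S)
        scores[S] = (rss / (n * p)) / (1.0 - trace / (n * p)) ** 2
    return scores

rng = np.random.default_rng(4)
truth = rng.standard_normal((60, 3)) @ rng.standard_normal((3, 12))   # rank-3 signal
X = truth + 0.3 * rng.standard_normal((60, 12))
scores = gcv_scores(X - X.mean(axis=0), max_dim=6)
S_hat = min(scores, key=scores.get)              # dimension with the smallest GCV
```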
Best graphics for X = X̃ + ε
Regularized PCA: biased but closer to X̃
PCA (unbiased) with confidence areas? ⇒ Residual bootstrap:
1. bootstrap the residuals (ε̂ = X − X̂), or draw from N(0, σ̂²): ε^b
2. X^b = X̂ + ε^b
3. PCA on X^b: U^b, Λ^b, V^b ⇒ (U¹, Λ¹, V¹), ..., (U^B, Λ^B, V^B)
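The three steps can be sketched in numpy as follows; the data, rank and number of bootstrap replicates are illustrative, and residual cells are resampled with replacement (one of the two options above):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p, S, B = 40, 8, 2, 200
truth = rng.standard_normal((n, S)) @ rng.standard_normal((S, p))
X = truth + 0.5 * rng.standard_normal((n, p))

def rank_s_fit(M, S):
    U, sval, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :S] * sval[:S]) @ Vt[:S]

X_hat = rank_s_fit(X, S)
resid = X - X_hat                                    # step 1: residuals

boot = np.empty((B, n, p))
for b in range(B):
    eps_b = rng.choice(resid.ravel(), size=(n, p))   # step 1: resampled residual cells
    boot[b] = rank_s_fit(X_hat + eps_b, S)           # steps 2-3: refit PCA on X^b

se = boot.std(axis=0)       # cell-wise bootstrap variability of the fitted values
```

The spread of the B fitted configurations is what the confidence areas on the factor maps summarize.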
[Figures: individuals factor map (10 Loire wines, e.g. S Michaud, V Font Brûlés) and variables factor map (sensory descriptors, e.g. O.Av.Intensité, Sucrée, Acide), Dim 1 (52.64%), Dim 2 (25.59%), with bootstrap confidence areas]
Analysis of variance framework
[Figure: two-way data layout — rows G1, ..., GI (genotypes) by columns E1, ..., EJ (environments) with responses Y11, ..., YIJ]
⇒ Related to AMMI (Additive Main effects and Multiplicative Interaction) models in ANOVA:
y_ij = μ + α_i + β_j + Σ_{s=1}^S λ_s u_is v_js + ε_ij
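An AMMI fit reduces to main-effect means followed by an SVD of the doubly centred residuals; a minimal sketch on made-up data (the table sizes, rank and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(7)
I, J, S = 12, 6, 1
# two-way table: additive main effects + rank-1 interaction + noise (hypothetical data)
Y = (rng.standard_normal((I, 1)) + rng.standard_normal((1, J))
     + rng.standard_normal((I, 1)) @ rng.standard_normal((1, J))
     + 0.3 * rng.standard_normal((I, J)))

mu = Y.mean()
alpha = Y.mean(axis=1) - mu               # genotype main effects
beta = Y.mean(axis=0) - mu                # environment main effects
resid = Y - mu - alpha[:, None] - beta[None, :]

# multiplicative interaction: rank-S SVD of the doubly centred residuals
U, sval, Vt = np.linalg.svd(resid, full_matrices=False)
fitted = mu + alpha[:, None] + beta[None, :] + (U[:, :S] * sval[:S]) @ Vt[:S]
```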