PLS Path Modeling
Michel Tenenhaus ([email protected])
PLS Methods
Initiated by Herman Wold, Svante Wold, Harald Martens and Jan-Bernd Lohmöller
1. NIPALS (Nonlinear Iterative Partial Least Squares)
2. PLS Regression (Partial Least Squares Regression)
3. PLS Discriminant Analysis
4. SIMCA (Soft Independent Modeling by Class Analogy)
5. PLS Approach to Structural Equation Modeling
6. N-way PLS
7. PLS Logistic Regression
8. PLS Generalized Linear Model
PLS Methods
PLS Path Modeling:
PLS Approach to Structural Equation Modeling
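ECSI path model for a "mobile phone provider"
[Figure: path diagram linking the latent variables Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint and Loyalty. Each arrow carries a path coefficient with its p-value in parentheses, and each endogenous latent variable carries its R2.]
Path coefficients (p-values), in extraction order: .493 (.000), .545 (.000), .066 (.314), .037 (.406), .153 (.006), .212 (.002), .540 (.000), .544 (.000), .200 (.000), .466 (.000), .540 (.000), .05 (.399)
R2 values, in extraction order: .243, .297, .335, .672, .432, .292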
Structural Equation Modeling: the PLS approach of Herman Wold
• Study of a system of linear relationships between latent variables.
• Each latent variable is either described by a set of manifest variables or summarizes them.
• Variables can be numerical, ordinal or nominal (no normality assumption is needed).
• The number of observations can be small compared to the number of variables.
Economic inequality and political instability
Data from Russett (1964), in GIFI
Economic inequality
Agricultural inequality
  GINI : Inequality of land distribution
  FARM : % of farmers that own half of the land (> 50)
  RENT : % of farmers that rent all their land
Industrial development
  GNPR : Gross national product per capita ($ 1955)
  LABO : % of labor force employed in agriculture
Political instability
  INST : Instability of executive (45-61)
  ECKS : Number of violent internal war incidents (46-61)
  DEAT : Number of people killed as a result of civic group violence (50-62)
  D-STAB : Stable democracy
  D-UNST : Unstable democracy
  DICT : Dictatorship
Economic inequality and political instability (Data from Russett, 1964)
              Gini  Farm  Rent  Gnpr  Labo  Inst  Ecks  Deat  Demo
Argentina     86.3  98.2  32.9   374    25  13.6    57   217     2
Australia     92.9  99.6     *  1215    14  11.3     0     0     1
Austria       74.0  97.4  10.7   532    32  12.8     4     0     2
France        58.3  86.1  26.0  1046    26  16.3    46     1     2
Yugoslavia    43.7  79.8   0.0   297    67   0.0     9     0     3
Demo: 1 = Stable democracy, 2 = Unstable democracy, 3 = Dictatorship
47 countries
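Economic inequality and political instability
[Figure: path diagram. Agricultural inequality (X1, latent variable ξ1) is measured by GINI, FARM and RENT; Industrial development (X2, ξ2) by GNPR and LABO; Political instability (X3, ξ3) by INST, ECKS, DEAT, D-STAB, D-UNST and DICT. ξ1 and ξ2 both point to ξ3. Each arrow is annotated with its expected sign (+ or -).]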
Modeling
• Reflective model (the block is supposed to be unidimensional)
Each manifest variable $X_{hj}$ is written as: $X_{hj} = \lambda_{hj}\,\xi_h + \varepsilon_{hj}$
• Formative model (the block can be multidimensional)
The latent variable $\xi_h$ is a function of the manifest variables of its block $X_h$: $\xi_h = \sum_j \omega_{hj} X_{hj}$
• There exists a linear structural relationship between the latent variables:
Political instability ($\xi_3$) = $\beta_1$ Agri. inequality ($\xi_1$) + $\beta_2$ Ind. development ($\xi_2$) + residual
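Estimation of latent variables using the PLS approach
(1) External (outer) estimation $Y_h$ of $\xi_h$
$Y_h = X_h w_h$
(2) Internal (inner) estimation $Z_h$ of $\xi_h$
$Z_h = \sum_{j:\ \xi_j \text{ related with } \xi_h} \operatorname{sign}[\operatorname{cor}(\xi_h, \xi_j)]\, Y_j$
(3) Calculation of $w_h$
$w_{hj} = \operatorname{cor}(Z_h, X_{hj})$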
Estimation of latent variables using the PLS approach
(1) External estimation Yh of ξh: Yh = Xhwh
Y1 = w11 Gini + w12 Farm + w13 Rent
Y2 = X2w2
Y3 = X3w3
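Estimation of latent variables using the PLS approach
(2) Internal estimation Zh of ξh (Centroid scheme)
$Z_h = \sum_{j:\ \xi_j \text{ related with } \xi_h} \operatorname{sign}[\operatorname{cor}(\xi_h, \xi_j)]\, Y_j$
Z1 = sign(cor(ξ1, ξ3)) Y3 = (+1) Y3
Z2 = sign(cor(ξ2, ξ3)) Y3 = (-1) Y3
Z3 = sign(cor(ξ3, ξ1)) Y1 + sign(cor(ξ3, ξ2)) Y2 = (+1) Y1 + (-1) Y2
[Figure: path diagram with ξ1 → ξ3 (sign +) and ξ2 → ξ3 (sign -).]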
Estimation of latent variables using the PLS approach
(3) Calculation of the weights wh
whj = cor(Xhj , Zh)
w11 = cor(Gini , Z1)
w12 = cor(Farm , Z1)
w13 = cor(Rent , Z1)
The other whj are computed in the same way.
Weight initialization in PLS-Graph
Option "1" : All weights are equal to 1.
Option "-1" : All weights are equal to 1, except the last one, which is set to -1.
w11,initial = 1
w12,initial = 1
w13,initial = -1
This choice allows some sign control:
if the variable with the largest weight is placed in the last position, this weight has a good chance of being negative.
Economic inequality and political instability
Estimation of latent variables with the PLS approach
(1) External estimation
Y1 = X1w1
Y2 = X2w2
Y3 = X3w3
(2) Internal estimation
Z1 = Y3
Z2 = -Y3
Z3 = Y1 - Y2
(3) Calculation of wh
w1j = cor(X1j , Z1)
w2j = cor(X2j , Z2)
w3j = cor(X3j , Z3)
Algorithm
• Begin with arbitrary weights w1, w2, w3.
• Get new weights wh by using (1) to (3).
• Iterate until convergence.
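The three steps above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the PLS-Graph implementation: the function names, the convergence tolerance and the synthetic data are assumptions made here (the Russett data are not reproduced in full on the slides), and every latent variable is assumed to be linked to at least one other.

```python
import numpy as np

def standardize(M):
    """Center each column (or a vector) and scale it to unit variance."""
    return (M - M.mean(axis=0)) / M.std(axis=0)

def pls_mode_a_centroid(blocks, links, max_iter=500, tol=1e-10):
    """Iterate steps (1)-(3) with Mode A weights and the centroid scheme.

    blocks: list of (n, p_h) arrays, one per latent variable
    links:  symmetric 0/1 array, links[h, j] = 1 when LVs h and j are related
    """
    X = [standardize(B) for B in blocks]
    n = X[0].shape[0]
    w = [np.ones(Xh.shape[1]) for Xh in X]          # arbitrary starting weights
    for _ in range(max_iter):
        # (1) outer estimation Y_h = X_h w_h, standardized
        Y = np.column_stack([standardize(Xh @ wh) for Xh, wh in zip(X, w)])
        # (2) inner estimation: signed sum of the outer estimates of related LVs
        E = links * np.sign(np.corrcoef(Y, rowvar=False))
        Z = standardize(Y @ E.T)
        # (3) new weights w_hj = cor(X_hj, Z_h)
        w_new = [Xh.T @ Z[:, h] / n for h, Xh in enumerate(X)]
        change = max(np.max(np.abs(a - b)) for a, b in zip(w_new, w))
        w = w_new
        if change < tol:
            break
    Y = np.column_stack([standardize(Xh @ wh) for Xh, wh in zip(X, w)])
    return w, Y

# Synthetic data (hypothetical): xi3 depends on xi1 (+) and on xi2 (-)
rng = np.random.default_rng(0)
n = 500
xi1, xi2 = rng.normal(size=n), rng.normal(size=n)
xi3 = 0.4 * xi1 - 0.7 * xi2 + 0.5 * rng.normal(size=n)

def make_block(xi, p):
    """p noisy indicators of the latent variable xi."""
    return xi[:, None] + 0.5 * rng.normal(size=(n, p))

blocks = [make_block(xi1, 3), make_block(xi2, 2), make_block(xi3, 6)]
links = np.array([[0, 0, 1],
                  [0, 0, 1],
                  [1, 1, 0]])

w, Y = pls_mode_a_centroid(blocks, links)
print(np.round(np.corrcoef(Y, rowvar=False), 2))
```

With the same link structure as the Russett model (blocks 1 and 2 both related to block 3), the inner estimates reduce to Z1 = ±Y3, Z2 = ±Y3 and Z3 = ±Y1 ± Y2, as on the slide.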
Use of PLS-Graph (Wynne Chin)
Results
Loading = regression coefficient of Xhj on Yh
        = cor(Xhj, Yh) if the X variables are standardized (correlation)
Outer Model
===============================
Variable     Weight   Loading
-------------------------------
Ineg_agri  outward
  gini       0.4567   0.9745
  farm       0.5125   0.9857
  rent       0.1018   0.5156
-------------------------------
Dev_ind    outward
  gnpr       0.5113   0.9501
  labo      -0.5384  -0.9551
-------------------------------
Inst_pol   outward
  inst       0.1187   0.3676
  ecks       0.2855   0.8241
  death      0.2977   0.7910
  demostab  -0.3271  -0.8635
  demoinst   0.0370   0.1037
  dictatur   0.2758   0.7227
===============================
Results
Eta .. Latent variables
========================================
        ineg_agr  dev_indu  inst_pol
----------------------------------------
arg        .964      .238      .755
aus       1.204     1.371    -1.617
aut        .397      .253     -.480
bel       -.812     1.530     -.846
bol       1.115    -1.584     1.505
bré        .778     -.654      .302
...
tai       -.009     -.898     -.068
ru         .134     2.059    -1.046
eu         .193     2.016     -.942
uru        .699      .179    -1.298
ven       1.149      .252     1.135
rfa       -.212     1.104     -.494
you      -2.189     -.654      .125
========================================
Latent variable estimation (PLS results)
              Y1     Y2     Y3
Argentina    0.96   0.24   0.75
Australia    1.20   1.37  -1.62
Austria      0.39   0.25  -0.48
France      -0.88   0.80   0.56
Yugoslavia  -2.19  -0.65   0.13
Multiple regression of Y3 on Y1 and Y2
R2 = 0.618
Political instability = 0.217 Agricultural inequality - 0.692 Industrial development
                        (2.24)                          (-7.22)
The Student t values in parentheses come from the multiple regression results.
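Economic inequality and political instability
[Figure: estimated path diagram. Loadings: Agricultural inequality (X1): GINI .974, FARM .986, RENT .516; Industrial development (X2): GNPR .950, LABO -.955; Political instability (X3): INST .368, ECKS .824, DEAT .791, D-STB -.864, D-UNS .104, DICT .723. Path coefficients to Political instability: .217 from Agricultural inequality and -.692 from Industrial development; R2 = 0.618.]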
Map of countries: Y1 = agricultural inequality (horizontal axis), Y2 = industrial development (vertical axis)
[Figure: scatter plot of the countries, each labelled with its regime type (1 = stable democracy, 2 = unstable democracy, 3 = dictatorship). Most stable democracies lie in the upper half (Y2 > 0) and most dictatorships in the lower half (Y2 < 0).]
Results
Inner Model
=======================
Block       Mult.RSq
-----------------------
Inégalit     0.0000
Développ     0.0000
Instabil     0.6180
=======================
Path coefficients
========================================
            Ineg_agr  Dev_indu  Inst_pol
----------------------------------------
Ineg_agr      0.000     0.000     0.000
Dev_indu      0.000     0.000     0.000
Inst_pol      0.217    -0.692     0.000
========================================
Correlations of latent variables
========================================
            Ineg_agr  Dev_indu  Inst_pol
----------------------------------------
Ineg_agr      1.000
Dev_indu     -0.309     1.000
Inst_pol      0.431    -0.759     1.000
========================================
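Because the latent variable scores are standardized, the path coefficients and the R2 of Inst_pol can be recovered from the correlation table alone. The sketch below assumes nothing beyond the three printed correlations; small differences from the printed 0.217, -0.692 and 0.618 are expected because the correlations are rounded to three decimals.

```python
import numpy as np

# Correlations of the latent variables, copied from the table above
# (order: Ineg_agr, Dev_indu, Inst_pol)
R = np.array([[1.000, -0.309, 0.431],
              [-0.309, 1.000, -0.759],
              [0.431, -0.759, 1.000]])

R_xx = R[:2, :2]   # correlations among the two explanatory LVs
r_xy = R[:2, 2]    # their correlations with Inst_pol

beta = np.linalg.solve(R_xx, r_xy)   # standardized regression coefficients
r2 = float(beta @ r_xy)              # multiple R-squared

print(np.round(beta, 3), round(r2, 3))
```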
Results
Outer Model
========================================================
Variable     Weight   Loading  Communality  Redundancy
--------------------------------------------------------
Ineg_agri  outward
  gini       0.4567   0.9745     0.9496       0.0000
  farm       0.5125   0.9857     0.9716       0.0000
  rent       0.1018   0.5156     0.2659       0.0000
--------------------------------------------------------
Dev_indu   outward
  gnpr       0.5113   0.9501     0.9027       0.0000
  labo      -0.5384  -0.9551     0.9123       0.0000
--------------------------------------------------------
Inst_pol   outward
  inst       0.1187   0.3676     0.1352       0.0835
  ecks       0.2855   0.8241     0.6792       0.4197
  death      0.2977   0.7910     0.6257       0.3867
  demostab  -0.3271  -0.8635     0.7457       0.4608
  demoinst   0.0370   0.1037     0.0107       0.0066
  dictatur   0.2758   0.7227     0.5223       0.3228
========================================================
Communality = Cor²(Xhj, Yh) = Loading²
For an endogenous LV: Redundancy = Cor²(Xhj, Yh) * R²(Yh, LVs explaining Yh)
Average redundancy for Inst_pol = 0.28
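The two definitions can be checked against the Inst_pol block of the table. The sketch below uses only the printed loadings and the printed R2 of 0.6180; the dictionary layout is an arbitrary choice made for illustration.

```python
import numpy as np

# Loadings of the Inst_pol block and its R-squared, copied from the output above
loadings = {"inst": 0.3676, "ecks": 0.8241, "death": 0.7910,
            "demostab": -0.8635, "demoinst": 0.1037, "dictatur": 0.7227}
r2_inst_pol = 0.6180

communality = {k: v ** 2 for k, v in loadings.items()}            # Loading squared
redundancy = {k: c * r2_inst_pol for k, c in communality.items()}

av_commun = np.mean(list(communality.values()))
av_redund = np.mean(list(redundancy.values()))
print(round(av_commun, 4), round(av_redund, 4))
```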
Results
Inner Model
===========================================================
Block       Mult.RSq  AvCommun  AvRedund  Goodness of Fit
-----------------------------------------------------------
Ineg_agri    0.0000    0.7290    0.0000
Dev_indu     0.0000    0.9075    0.0000
Inst_pol     0.6180    0.4531    0.2800
-----------------------------------------------------------
Average      0.6180    0.6110    0.2800        .614
===========================================================
$(\text{Average communality})_h = \frac{1}{p_h}\sum_{j=1}^{p_h} \operatorname{Cor}^2(X_{hj}, Y_h)$ = average variance of $X_h$ explained by $Y_h$ = $\mathrm{AVE}_h$
A latent variable must explain at least 50% of its block variance.
Average Communality = (3*AvCommun1 + 2*AvCommun2 + 6*AvCommun3)/11
Value of the internal model: .618 (average R2). Value of the external model: .611 (average communality). GoF = $\sqrt{.618 \times .611}$ = .614
A global index of model fit: PLS Goodness of Fit
Inner Model
===================================================
Block       Mult.RSq  AvCommun  AvRedund    GoF
---------------------------------------------------
Ineg_agri    0.0000    0.7290    0.0000
Dev_indu     0.0000    0.9075    0.0000
Inst_pol     0.6180    0.4531    0.2800
---------------------------------------------------
Average      0.618     0.6110    0.2800     .614
===================================================
$\mathrm{GoF} = \sqrt{\Big[\frac{1}{\text{Nb of endogenous LVs}} \sum_{h\ \text{endogenous}} R^2(Y_h;\ \text{other LVs explaining } Y_h)\Big] \times \Big[\frac{1}{\text{Nb of MVs}} \sum_h \sum_{j=1}^{p_h} \operatorname{Cor}^2(X_{hj}, Y_h)\Big]}$
The first factor measures the quality of the inner model, the second that of the outer model.
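As a quick numerical check, the GoF index can be recomputed from the block averages printed in the table. This is a sketch using only those printed values; the variable names are chosen here for illustration.

```python
import math

# (average communality, number of manifest variables) per block, from the table
blocks = {"Ineg_agri": (0.7290, 3), "Dev_indu": (0.9075, 2), "Inst_pol": (0.4531, 6)}
r2_endogenous = [0.6180]          # only Inst_pol is endogenous

n_mvs = sum(p for _, p in blocks.values())
outer = sum(c * p for c, p in blocks.values()) / n_mvs   # average communality over all MVs
inner = sum(r2_endogenous) / len(r2_endogenous)          # average R-squared
gof = math.sqrt(inner * outer)

print(round(outer, 4), round(gof, 3))
```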
Discriminant validity
An LV should share more variance with its own MVs than with the other LVs
AVE and squared correlations
========================================
Ineg_agr Dev_indu Inst_pol
----------------------------------------
Ineg_agr 0.729
Dev_indu 0.095 0.907
Inst_pol 0.186 0.576 0.453
========================================
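AVE(Yj) must be larger than cor²(Yj, Yh) for all h ≠ j.
The condition fails for Inst_pol: its AVE (0.453) is smaller than its squared correlation with Dev_indu (0.576).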
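The rule can be applied mechanically to the table of AVE values and squared correlations. The sketch below copies the printed figures and lists every pair (j, h) for which AVE(Yj) is not larger than cor²(Yj, Yh); the array layout is an assumption made for illustration.

```python
import numpy as np

names = ["Ineg_agr", "Dev_indu", "Inst_pol"]
ave = np.array([0.729, 0.907, 0.453])            # diagonal of the table
sq_cor = np.array([[0.0, 0.095, 0.186],          # squared correlations between LVs
                   [0.095, 0.0, 0.576],          # (diagonal unused)
                   [0.186, 0.576, 0.0]])

violations = [(names[j], names[h])
              for j in range(3) for h in range(3)
              if j != h and ave[j] <= sq_cor[j, h]]
print(violations)
```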
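Using PLS-Graph
[Figure: PLS-Graph output with t-statistics on the two paths to Political instability: t = 1.705 from Agricultural inequality and t = -7.685 from Industrial development.]
The t-statistics come from bootstrap re-sampling.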
Bootstrap validation in PLS-Graph
Sign control: Individual sign changes / Construct level changes (*)
Outer Model Loadings:
====================================================================
             Entire sample   Mean of      Standard   T-Statistic
             estimate        subsamples   error
--------------------------------------------------------------------
Agricultural inequality:
  gini          0.9745        0.9584       0.0336      28.9616
  farm          0.9857        0.9689       0.0329      29.9339
  rent          0.5156        0.4204       0.2462       2.0946
Industrial development:
  gnpr          0.9501        0.9489       0.0121      78.3692
  labo         -0.9551       -0.9536       0.0107     -89.1493
Political instability:
  inst          0.3676        0.3347       0.1756       2.0932
  ecks          0.8241        0.8138       0.0699      11.7920
  demostab     -0.8635       -0.8520       0.0667     -12.9419
  demoinst      0.1037        0.0955       0.1611       0.6438
  dictatur      0.7227        0.7195       0.0841       8.5915
  death         0.7910        0.7977       0.0528      14.9773
====================================================================
(*) used here
Bootstrap validation in PLS-Graph: sign control
Individual sign changes:
  The sign of each bootstrapped weight is automatically set equal to the sign of the corresponding full-sample weight.
Construct level changes (default):
  For each LV (construct), the signs of all its weights are reversed together if the new loadings (after reversal) are closer to the full-sample loadings than the bootstrapped loadings (before reversal).
PLS-Graph: Bootstrap validation
Path Coefficients Table (Entire Sample Estimate):
====================================================================
              Agr. ineq.   Ind. dev.   Pol. inst.
Agr. ineq.      0.0000      0.0000      0.0000
Ind. dev.       0.0000      0.0000      0.0000
Pol. inst.      0.2170     -0.6920      0.0000
====================================================================
Path Coefficients Table (Mean of Subsamples):
====================================================================
              Agr. ineq.   Ind. dev.   Pol. inst.
Agr. ineq.      0.0000      0.0000      0.0000
Ind. dev.       0.0000      0.0000      0.0000
Pol. inst.      0.2328     -0.6743      0.0000
====================================================================
Path Coefficients Table (Standard Error):
====================================================================
              Agr. ineq.   Ind. dev.   Pol. inst.
Agr. ineq.      0.0000      0.0000      0.0000
Ind. dev.       0.0000      0.0000      0.0000
Pol. inst.      0.1272      0.0900      0.0000
====================================================================
Path Coefficients Table (T-Statistic):
====================================================================
              Agr. ineq.   Ind. dev.   Pol. inst.
Agr. ineq.      0.0000      0.0000      0.0000
Ind. dev.       0.0000      0.0000      0.0000
Pol. inst.      1.7054     -7.6855      0.0000
====================================================================
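The printed t-statistics are consistent, up to rounding, with the ratio of the entire-sample estimate to the bootstrap standard error. The sketch below illustrates that ratio on hypothetical data, with a plain correlation standing in for a path coefficient; `bootstrap_t` is a name introduced here, and the sign-control options of PLS-Graph are not reproduced.

```python
import numpy as np

def bootstrap_t(data, statistic, n_boot=500, seed=0):
    """Return (estimate, mean of resamples, standard error, t) for a statistic."""
    rng = np.random.default_rng(seed)
    n = len(data)
    estimate = statistic(data)
    # Resample rows with replacement and recompute the statistic each time
    reps = np.array([statistic(data[rng.integers(0, n, size=n)])
                     for _ in range(n_boot)])
    se = reps.std(ddof=1)
    return estimate, reps.mean(), se, estimate / se

# Hypothetical data: 47 observations of two negatively related variables
rng = np.random.default_rng(1)
x = rng.normal(size=47)
y = -0.7 * x + 0.7 * rng.normal(size=47)
data = np.column_stack([x, y])

est, mean_sub, se, t = bootstrap_t(
    data, lambda d: np.corrcoef(d[:, 0], d[:, 1])[0, 1])

# Same ratio applied to the figures printed in the tables above
t_table = np.array([0.2170, -0.6920]) / np.array([0.1272, 0.0900])
print(np.round(t_table, 2))
```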
SPECIAL CASES OF PLS PATH MODELLING
• Principal component analysis
• Multiple factor analysis
• Canonical correlation analysis
• Redundancy analysis
• PLS Regression
• Generalized canonical correlation analysis (Horst)
• Generalized canonical correlation analysis (Carroll)
Options of the PLS algorithm
External estimation: Yj = Xjwj
  Mode A (for reflective): wjh = cor(Xjh , Zj)
  Mode B (for formative): $w_j = (X_j' X_j)^{-1} X_j' Z_j$
Internal estimation: $Z_j = \sum_i e_{ji} Y_i$
  Centroid scheme: eji = sign of cor(Yi, Yj)
  Factorial scheme: eji = cor(Yi, Yj)
  Path weighting scheme: eji = regression coefficient in the regression of Yj on the Yi's
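The centroid and factorial schemes differ only in the inner weights applied to the linked outer estimates. The sketch below computes both on hypothetical standardized scores; the function names and the 0/1 `links` matrix are illustrative assumptions, and the path weighting scheme is left out.

```python
import numpy as np

def inner_weights(Y, links, scheme="centroid"):
    """Inner weights e_ji for linked latent variables (0 where not linked)."""
    R = np.corrcoef(Y, rowvar=False)
    if scheme == "centroid":
        E = np.sign(R)        # e_ji = sign of cor(Y_i, Y_j)
    elif scheme == "factorial":
        E = R                 # e_ji = cor(Y_i, Y_j)
    else:
        raise ValueError("only 'centroid' and 'factorial' are sketched here")
    return links * E

def inner_estimate(Y, links, scheme="centroid"):
    """Z_j = sum_i e_ji Y_i, one column per latent variable."""
    return Y @ inner_weights(Y, links, scheme).T

# Hypothetical outer estimates: the third LV is linked to the first two
rng = np.random.default_rng(0)
y1, y2 = rng.normal(size=100), rng.normal(size=100)
y3 = y1 - y2 + 0.5 * rng.normal(size=100)
Y = np.column_stack([y1, y2, y3])
Y = (Y - Y.mean(axis=0)) / Y.std(axis=0)
links = np.array([[0, 0, 1],
                  [0, 0, 1],
                  [1, 1, 0]])

E_centroid = inner_weights(Y, links, "centroid")
E_factorial = inner_weights(Y, links, "factorial")
Z = inner_estimate(Y, links, "centroid")
```

The factorial scheme keeps the size of each correlation, so weakly related latent variables contribute less to Z than under the centroid scheme.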
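The general PLS algorithm
Initial step: choose starting weights wj.
Outer estimation: Yj = Xjwj (standardized).
Choice of weights e: Centroid, Factorial or Path weighting scheme.
Inner estimation: $Z_j = e_{j1} Y_{j1} + e_{j2} Y_{j2} + \dots + e_{jm} Y_{jm}$, where $Y_{j1}, \dots, Y_{jm}$ are the outer estimates of the latent variables connected to block j.
New weights:
  Mode A: $w_j = \frac{1}{n} X_j' Z_j$
  Mode B: $w_j = \left(\frac{1}{n} X_j' X_j\right)^{-1} \left(\frac{1}{n} X_j' Z_j\right)$
In Mode A, $w \propto \operatorname{cor}(X, Z)$. Look at the loadings, not at the weights w.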
Some modified multi-block methods for SEM
cjk = 1 if blocks are linked, 0 otherwise
• SUMCOR (Horst, 1961): $\max \sum_{j,k} c_{jk} \operatorname{Cor}(F_j, F_k)$ [PLS: B, Horst (new)]
• Mathes (1993), Hanafi (2004): $\max \sum_{j,k} c_{jk} \operatorname{Cor}^2(F_j, F_k)$ [PLS: B, Factorial]
• Mathes (1993), Hanafi (2004): $\max \sum_{j,k} c_{jk} |\operatorname{Cor}(F_j, F_k)|$ [PLS: B, Centroid]
• MAXBET (Van de Geer, 1984): $\max_{\text{all } \|w_j\| = 1} \big[\sum_j \operatorname{Var}(X_j w_j) + \sum_{j \ne k} c_{jk} \operatorname{Cov}(X_j w_j, X_k w_k)\big]$
• MAXDIFF (Van de Geer, 1984): $\max_{\text{all } \|w_j\| = 1} \sum_{j \ne k} c_{jk} \operatorname{Cov}(X_j w_j, X_k w_k)$ [PLS: A, Horst; new approach]
• MAXDIFF B (Hanafi & Kiers, 2006): $\max_{\text{all } \|w_i\| = 1} \sum_{i \ne j} c_{ij} \operatorname{Cov}^2(X_i w_i, X_j w_j)$ [PLS: A, Factorial; new approach]
PLS approach : 2 blocks (X1 and X2)
Mode for weight calculation
Y1 = X1w1   Y2 = X2w2   Method                                          Deflation (*)
A           A           PLS regression of X2 on X1                      On X1 only
B           A           Redundancy analysis of X2 with respect to X1    On X1 only
A           A           Tucker Inter-Battery Factor Analysis            On X1 and X2
B           B           Canonical Correlation Analysis                  On X1 and X2
(*) Deflation: working on the residuals of the regression of X on the previous LVs in order to obtain orthogonal LVs.
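PLS regression (2 components)
- Mode A for X
- Mode A for Y
- Deflate only X
Criterion for each dimension (dim 1, dim 2):
$\max_{\|a\| = \|b\| = 1} \operatorname{Cov}(Xa, Yb) = \max_{\|a\| = \|b\| = 1} \operatorname{Cor}(Xa, Yb)\,\sqrt{\operatorname{Var}(Xa)}\,\sqrt{\operatorname{Var}(Yb)}$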
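For the first dimension, the covariance criterion has a closed-form solution: the maximizing unit vectors a and b are the leading singular vectors of the cross-covariance matrix of X and Y. The sketch below illustrates this on hypothetical data and checks the covariance/correlation identity; deflation and the second dimension are not covered.

```python
import numpy as np

# Hypothetical data: Y depends on the first two columns of X
rng = np.random.default_rng(0)
n = 100
X = rng.normal(size=(n, 4))
Y = X[:, :2] @ np.array([[1.0, 0.5], [-0.5, 1.0]]) + 0.3 * rng.normal(size=(n, 2))
Xc, Yc = X - X.mean(axis=0), Y - Y.mean(axis=0)

# Leading singular vectors of the cross-covariance matrix X'Y / n
U, s, Vt = np.linalg.svd(Xc.T @ Yc / n, full_matrices=False)
a, b = U[:, 0], Vt[0]

t, u = Xc @ a, Yc @ b
cov_ab = np.mean(t * u)                                  # Cov(Xa, Yb)
product = np.corrcoef(t, u)[0, 1] * t.std() * u.std()    # Cor * sd(Xa) * sd(Yb)
print(round(cov_ab, 4), round(s[0], 4), round(product, 4))
```

The largest singular value s[0] is the maximum of the criterion, so any other pair of unit vectors gives a covariance no larger than s[0].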
PLS Regression in SIMCA-P : PLS Scores
[Figure: scatter plot of the countries on the first two PLS components, t[1] (horizontal axis) and t[2] (vertical axis).]
Correlation loadings
[Figure: correlation loadings plot of GINI, FARM, RENT, GNPR, LABO, INST, ECKS, DEAT, DEMOSTB, DEMOINST and DICTATURE; axes pc(corr)[Comp. 1] (horizontal) and pc(corr)[Comp. 2] (vertical), both running from -1.00 to 1.00.]
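Redundancy analysis of X on Y (2 components)
- Mode A for X
- Mode B for Y
- Deflate only X
Criterion for each dimension (dim 1, dim 2):
$\max_{\|a\| = 1,\ \operatorname{Var}(Yb) = 1} \operatorname{Cov}(Xa, Yb) = \max_{\|a\| = 1,\ \operatorname{Var}(Yb) = 1} \operatorname{Cor}(Xa, Yb)\,\sqrt{\operatorname{Var}(Xa)}$
Equivalent form: $\max_{\operatorname{Var}(Yb) = 1} \sum_j \operatorname{Cor}^2(x_j, Yb)$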
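Inter-battery factor analysis (2 components)
- Mode A for X
- Mode A for Y
- Deflate both X and Y
Criterion for each dimension (dim 1, dim 2):
$\max_{\|a\| = \|b\| = 1} \operatorname{Cov}(Xa, Yb) = \max_{\|a\| = \|b\| = 1} \operatorname{Cor}(Xa, Yb)\,\sqrt{\operatorname{Var}(Xa)}\,\sqrt{\operatorname{Var}(Yb)}$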
Canonical correlation analysis (2 components)
- Mode B for X
- Mode B for Y
- Deflate both X and Y
Criterion for each dimension (dim 1, dim 2):
$\max_{\operatorname{Var}(Xa) = \operatorname{Var}(Yb) = 1} \operatorname{Cov}(Xa, Yb)$
PLS approach : K blocks
[Figure: the K blocks X1, ..., XK and the super-block X obtained by putting them side by side.]
Deflation: on the super-block only
Mode for weight calculation, by scheme for internal estimation calculation (Centroid, Factorial, Structural):
Mode A: PCA of X; Multiple Factor Analysis of the Xj's; ACOM (Chessel & Hanafi)
Mode B: Generalized Canonical Correlation Analysis (Horst) with the Centroid scheme; Generalized Canonical Correlation Analysis (Carroll) with the Factorial scheme
Three cells of the table are marked "NEW !".
A new PLS algorithm
Arthur & Michel Tenenhaus
cij = 1 if blocks are linked, 0 otherwise
Horst scheme: Maximize $\sum_{i,j} c_{ij} \operatorname{Cov}(X_i a_i, X_j a_j)$
Centroid scheme: Maximize $\sum_{i,j} c_{ij} |\operatorname{Cov}(X_i a_i, X_j a_j)|$
Factorial scheme: Maximize $\sum_{i,j} c_{ij} \operatorname{Cov}^2(X_i a_i, X_j a_j)$
subject to the following constraints:
$\tau_i \|a_i\|^2 + (1 - \tau_i) \operatorname{Var}(X_i a_i) = 1$
For $\tau_i = 1$ (Mode A): $\|a_i\| = 1$
For $\tau_i = 0$ (Mode B): $\operatorname{Var}(X_i a_i) = 1$
CONCLUSION
• PLS IS TO COVARIANCE-BASED SEM AS PRINCIPAL COMPONENT ANALYSIS IS TO FACTOR ANALYSIS.
• WHEN INDIVIDUAL DATA ARE AVAILABLE, SIGNIFICANCE TESTS CAN BE CARRIED OUT WITH PLS BY CROSS VALIDATION METHODS.
Michel Tenenhaus ([email protected])
European Customer Satisfaction Index
PLS Path Modelling versus LISREL
The European Customer Satisfaction Index (ECSI)
• ECSI is an economic indicator that measures customer satisfaction.
• It is an adaptation of the Swedish Customer Satisfaction Barometer and the American Customer Satisfaction Index (ACSI) proposed by Claes Fornell.
• Fornell’s methodology is presented.
Path model describing causes and consequences of Customer Satisfaction
[Figure: path diagram linking the latent variables Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaints and Loyalty. The full model is drawn in red and blue, the reduced model in red.]
Content of the presentation
• Use of Fornell’s methodology on the full ECSI model
• Use of Fornell’s methodology on the reduced model
• Use of SEM-ML on the reduced model (SEM-ML did not work on the full model)
• Comparison between PLS and SEM-ML results on the reduced model
Measurement Instrument for the Mobile Phone Industry: examples of latent and manifest variables
Customer expectation
a) Expectations for the overall quality of "your mobile phone provider" at the moment you became a customer of this provider.
b) Expectations for "your mobile phone provider" to provide products and services to meet your personal needs.
c) How often did you expect that things could go wrong at "your mobile phone provider"?
Customer satisfaction
a) Overall satisfaction
b) Fulfilment of expectations
c) How well do you think "your mobile phone provider" compares with your ideal mobile phone provider?
Measurement Instrument for the Mobile Phone Industry: examples of latent and manifest variables
Customer loyalty
a) If you needed to choose a new mobile phone provider, how likely is it that you would choose "your provider" again?
b) Let us now suppose that other mobile phone providers decide to lower fees and prices, but "your mobile phone provider" stays at the same level as today. At which level of difference (in %) would you choose another phone provider?
c) If a friend or colleague asks you for advice, how likely is it that you would recommend "your mobile phone provider"?
And so on for the other latent variables ...
I. Study of the complete model using Fornell's approach
• Manifest variables V are transformed from a scale "1-10" to a scale "0-100": $x = 100 \times \frac{V - 1}{9}$
• Each latent variable is estimated as a weighted average of its manifest variables.
• PLS Path modeling is used to estimate the weights with Mode A and Centroid scheme options.
• Path coefficients are computed by multiple regression on the estimated latent variables, and t-statistics by cross-validation (bootstrap).
Use of PLS-Graph (Wynne Chin)
Results: the weights
Outer Model
======================
Variable       Weight
----------------------
Image    outward
  IMAG1        0.0147
  IMAG2        0.0127
  IMAG3        0.0137
  IMAG4        0.0177
  IMAG5        0.0143
----------------------
Expectat outward
  CUEX1        0.0232
  CUEX2        0.0224
  CUEX3        0.0252
----------------------
Per_Qual outward
  PERQ1        0.0098
  PERQ2        0.0085
  PERQ3        0.0118
  PERQ4        0.0094
  PERQ5        0.0084
  PERQ6        0.0095
  PERQ7        0.0129
----------------------
Per_Valu outward
  PERV1        0.0239
  PERV2        0.0247
----------------------
Satisfac outward
  CUSA1        0.0158
  CUSA2        0.0231
  CUSA3        0.0264
----------------------
Complain outward
  CUSCO        0.0397
----------------------
Loyalty  outward
  CUSL1        0.0185
  CUSL2        0.0061
  CUSL3        0.0225
======================
Fornell's computation of the latent variables
Example : Customer Satisfaction Index
$\mathrm{CSI} = \frac{0.0158\,\mathrm{CUSA1} + 0.0231\,\mathrm{CUSA2} + 0.0264\,\mathrm{CUSA3}}{0.0158 + 0.0231 + 0.0264}$
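The index is a weighted average of the rescaled answers, so it stays on the 0-100 scale. The sketch below applies the 1-10 to 0-100 transformation and the three CUSA weights to a few hypothetical respondents; the answers and the helper name are invented for illustration.

```python
import numpy as np

def rescale_1_10_to_0_100(v):
    """Map answers on a 1-10 scale to the 0-100 scale: x = 100 * (V - 1) / 9."""
    return (np.asarray(v, dtype=float) - 1.0) * 100.0 / 9.0

# PLS weights of CUSA1, CUSA2, CUSA3, copied from the outer model output
weights = np.array([0.0158, 0.0231, 0.0264])

# Hypothetical answers of three respondents to CUSA1, CUSA2, CUSA3
answers = np.array([[10, 10, 10],
                    [1, 1, 1],
                    [8, 7, 9]])

x = rescale_1_10_to_0_100(answers)
csi = x @ weights / weights.sum()     # weighted average, one score per respondent
print(np.round(csi, 1))
```

Dividing by the sum of the weights is what makes a respondent who answers 10 everywhere score 100 and one who answers 1 everywhere score 0.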
Mean and standard deviation of the latent variables
                          N   Minimum  Maximum     Mean   Std. Deviation
IMAGE                   250     26.49   100.00  72.6878      13.7660
CUSTOMER EXPECTATION    250     25.85   100.00  72.3198      14.1259
PERCEIVED QUALITY       250     23.95   100.00  74.5765      14.2573
PERCEIVED VALUE         250       .00   100.00  61.5887      20.5987
CUSTOMER SATISFACTION   250     23.68   100.00  71.2876      15.3417
COMPLAINT               250       .00   100.00  67.4704      25.2684
LOYALTY                 250      1.29   100.00  69.1757      21.2668
Correlations between manifest variables and latent variables
Correlations below 0.5 in absolute value are not shown.
Correlations of the manifest variables with their own latent variable:
  Image (Image1-Image5): .717, .565, .657, .791, .698
  Customer expectation (C_exp1-C_exp3): .689, .644, .724
  Perceived quality (P_qual1-P_qual7): .778, .651, .801, .760, .732, .766, .803
  Perceived value (P_val1-P_val2): .933, .911
  Customer satisfaction (C_sat1-C_sat3): .711, .872, .884
  Complaint: 1
  Loyalty (Loyalty1-Loyalty3): .854, .869 (one of the three correlations is below 0.5 and not shown)
The table also shows the correlations of 0.5 or more between manifest variables and the other latent variables (values between .500 and .707).
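ECSI path model for a "mobile phone provider"
Regression on standardized variables and t-statistics provided by PLS-Graph bootstrap, construct level change option
[Figure: path diagram linking the latent variables Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint and Loyalty. Each arrow carries a path coefficient with its bootstrap t-statistic in parentheses, and each endogenous latent variable carries its R2.]
Path coefficients (t-statistics), in extraction order: .492 (7.67), .544 (10.71), .066 (1.10), .037 (1.14), .153 (3.07), .211 (2.54), .541 (6.93), .543 (8.62), .201 (3.59), .468 (5.18), .540 (11.08), .049 (1.11)
R2 values, in extraction order: .242, .296, .335, .672, .432, .292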
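II. Study of the reduced model using Fornell's approach
[Figure: path diagram linking the latent variables Customer expectation, Perceived quality, Perceived value, Customer satisfaction and Loyalty. Each arrow carries a path coefficient with its t-statistic in parentheses, and each endogenous latent variable carries its R2.]
Path coefficients (t-statistics), in extraction order: .545 (8.92), .070 (1.08), .053 (1.20), .538 (6.59), .638 (3.70), .216 (12.35), .634 (11.50)
R2 values, in extraction order: .402, .297, .335, .660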
The new PLS weights
        Weight   Relative weight
CE1      .0237        .336
CE2      .0206        .292
CE3      .0262        .372
PQ1      .0098        .139
PQ2      .0085        .121
PQ3      .0118        .168
PQ4      .0094        .134
PQ5      .0084        .119
PQ6      .0095        .135
PQ7      .0129        .183
PV1     -.0239        .492
PV2     -.0247        .508
CS1      .0157        .241
CS2      .0240        .368
CS3      .0256        .392
CL1     -.0188        .405
CL2     -.0050        .108
CL3     -.0226        .487
For each latent variable the relative weights sum up to 1.
III. Study of the reduced model using AMOS: Model 1 (Standardized Results)
[Figure: AMOS path diagram of the reduced model with latent variables CUS_EXP (CE1-CE3), PER_QUAL (PQ1-PQ7), PER_VAL (PV1-PV2), CSI (three indicators) and CUS_LOY (CL1-CL3), error terms e1-e18 and disturbances d1-d4, annotated with standardized estimates.]
Chi-Square = 271, DF = 128, Chi-Square/DF = 2.12
Reduced model 2 (Standardized results)
[Figure: AMOS path diagram of the reduced model with latent variables CUS_EXP (CE1-CE3), PER_QUAL (PQ1-PQ7), PER_VAL (PV1-PV2), CSI (three indicators) and CUS_LOY (CL1-CL3), error terms e1-e18 and disturbances d1-d4, annotated with standardized estimates.]
Chi-Square = 271, DF = 130, Chi-Square/DF = 2.08, RMSEA = 0.066
H0: RMSEA ≤ 0.05: p-value = 0.01
Reduced model 2 (Unstandardized results)
[Figure: AMOS path diagram of the reduced model with latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, annotated with unstandardized loadings, path coefficients and variances. The first loading of each block is fixed to 1.00.]
Chi-Square = 271.118, df = 130, Chi-Square/df = 2.086, RMSEA = .066
p-value (RMSEA ≤ .05) = .010
Specific estimation of the latent variables
• Each latent variable is estimated as a weighted average of its own manifest variables, using the loadings $\lambda_{hj}$.
• For example
$Y_4 = \frac{\lambda_{41}\,\mathrm{CUSA1} + \lambda_{42}\,\mathrm{CUSA2} + \lambda_{43}\,\mathrm{CUSA3}}{\lambda_{41} + \lambda_{42} + \lambda_{43}}$
is the Customer Satisfaction Index score.
• Each coefficient $\lambda_{4j}$ is the regression coefficient of $\xi_4$ in the regression relating the manifest variable $X_{4j}$ to its latent variable $\xi_4$ (similar to PLS weight estimation when mode A is used).
Loadings and LISREL weights
        Loading   Weight
CE1      1.000     .343
CE2      0.913     .313
CE3      1.004     .344
PQ1      1.000     .133
PQ2      0.988     .132
PQ3      1.241     .165
PQ4      1.061     .141
PQ5      0.911     .121
PQ6      1.045     .139
PQ7      1.265     .168
PV1      1.000     .483
PV2      1.069     .517
CS1      1.000     .239
CS2      1.549     .370
CS3      1.634     .391
CL1      1.000     .424
CL2      0.202     .086
CL3      1.155     .490
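The LISREL weights in this table are the loadings of each block divided by their sum, which is what the weighted-average formula for the specific estimation implies. The sketch below recomputes them from the printed loadings; the dictionary keys are shorthand block names chosen here.

```python
import numpy as np

# Unstandardized loadings copied from the table above, grouped by block
loadings = {
    "CE": [1.000, 0.913, 1.004],
    "PQ": [1.000, 0.988, 1.241, 1.061, 0.911, 1.045, 1.265],
    "PV": [1.000, 1.069],
    "CS": [1.000, 1.549, 1.634],
    "CL": [1.000, 0.202, 1.155],
}

# Weight of each manifest variable = its loading / sum of the block's loadings
weights = {block: np.array(lam) / np.sum(lam) for block, lam in loadings.items()}

for block, w in weights.items():
    print(block, np.round(w, 3))
```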
Comparison between the PLS and LISREL weights
[Figure: scatter plot of the LISREL weight (vertical axis) against the PLS relative weight (horizontal axis), both axes running from 0.0 to .6.]
Correlations between the PLS latent variables and the specific LISREL latent variables
[Figure: five scatter plots of the PLS latent variable (vertical axis) against the specific LISREL latent variable (horizontal axis), one each for CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY.]
All the correlations are above .998
First conclusions
• If COV-BASED SEM works, the PLS results can be derived from the COV-BASED SEM results.
• If COV-BASED SEM does not work, PLS is still an alternative.
• If COV-BASED SEM is not adequate (small number of observations and/or large number of variables) PLS can be used for exploratory purposes.
Usual estimation of latent variables in LISREL
proc calis covariance modification data=ecsi outstat=a;
   lineqs
      CUEX1 = 1 f1 + e1,
      CUEX2 = Lambda12 f1 + e2,
      CUEX3 = Lambda13 f1 + e3,
      .
      .
      .
      CUSL1 = 1 f5 + e16,
      CUSL2 = Lambda52 f5 + e17,
      CUSL3 = Lambda53 f5 + e18,
      f2 = beta21 f1 + d2,
      f3 = beta31 f1 + beta32 f2 + d3,
      f4 = beta41 f1 + beta42 f2 + beta43 f3 + d4,
      f5 = beta54 f4 + d5;
   std
      e1-e18 = vare1-vare18,
      d2-d5 = vard2-vard5,
      f1 = varf1;
   var CUEX1 CUEX2 CUEX3 PERQ1 PERQ2 PERQ3 PERQ4 PERQ5
       PERQ6 PERQ7 PERV1 PERV2 CUSA1 CUSA2 CUSA3 CUSL1
       CUSL2 CUSL3;
run;
proc print data=a (where = (_type_="SCORE"));
run;
Variable weights for the usual estimation of the latent variables in LISREL
       CUEX1     CUEX2      CUEX3     PERQ1    PERQ2     PERQ3
f1    0.11102   0.074334   0.055776  0.07362  0.024507  0.050785
f2    0.03242   0.021705   0.016287  0.12987  0.043233  0.089590
f3   -0.00083  -0.000558  -0.000418  0.01235  0.004112  0.008522
f4    0.01321   0.008842   0.006634  0.04578  0.015241  0.031583
f5    0.00906   0.006064   0.004550  0.03140  0.010453  0.021661

       PERQ4     PERQ5     PERQ6     PERQ7     PERV1     PERV2     CUSA1
f1   0.046274  0.049023  0.046702  0.051524  -0.00071  -0.00440  0.030850
f2   0.081633  0.086483  0.082388  0.090894   0.00466   0.02873  0.047098
f3   0.007765  0.008227  0.007837  0.008646   0.10833   0.66864  0.027074
f4   0.028778  0.030488  0.029044  0.032043   0.00992   0.06122  0.093221
f5   0.019737  0.020910  0.019920  0.021977   0.00680   0.04199  0.063936

       CUSA2     CUSA3    CUSL1    CUSL2     CUSL3
f1   0.027752  0.03606  0.00387  0.000423  0.01533
f2   0.042369  0.05506  0.00591  0.000645  0.02340
f3   0.024356  0.03165  0.00340  0.000371  0.01345
f4   0.083861  0.10898  0.01170  0.001277  0.04632
f5   0.057516  0.07474  0.10838  0.011830  0.42895
Correlations between the PLS latent variables and the usual estimated LISREL latent variables
[Figure: five scatter plots of the PLS latent variable (vertical axis) against the usual LISREL estimate (horizontal axis).]
  CUS_EXP: Rsq = 0.6227
  PER_QUAL: Rsq = 0.9599
  PER_VAL: Rsq = 0.9010
  CSI (F4): Rsq = 0.8981
  CUS_LOY (F5): Rsq = 0.8592
Final conclusions
• COV-BASED SEM did not work on the full model.
• COV-BASED SEM gives better results for the inner model (relating the latent variables to one another) because the latent variables are space-free.
• PLS gives better results for the outer model (relating the manifest variables to their latent variables) because each latent variable is constrained to lie in the space of its own manifest variables.
• If each COV-BASED SEM latent variable is estimated as a weighted average of its own manifest variables, then COV-BASED SEM and PLS give almost identical latent variable estimates (at least on the examples we have studied).
PLS Path Modeling and Multiple Table Analysis
Application to the study of the cosmetic habits of women in Ile-de-France
Christiane Guinot (CERIES/CHANEL)
& Michel Tenenhaus (HEC)
Objective of the analysis
We applied the PLS approach to a study of the cosmetic habits of women living in Ile-de-France.
The aim of the project was to obtain a global score describing the propensity to use cosmetic products in this sample.
We then used behavioural and skin characteristic variables, which are known to account for the variation in the use of cosmetic products, to check the relevance of this score.
Data
The cosmetic products were divided into four blocks corresponding to different cosmetic practices:
  Body care: soap, liquid soap, moisturising body cream, hand creams
  Face care: make-up and eye make-up removers, tonic lotions, day creams, night creams, exfoliation products
  Make-up: blushers, mascaras, eye shadows, eye pencils, lipsticks, lip shiners and nail polish
  Sun care: sun protection products for face and for body, after-sun products for face and for body
Construction of a global score
[Figure: hierarchical path model. The manifest variables (cosmetic practices) are grouped into four blocks, each summarized by a partial score: the latent variables (B) ξ1 Body care, ξ2 Face care, ξ3 Make-up and ξ4 Sun care. The global score is the latent variable (A) attached to the Body care, Face care, Make-up and Sun care blocks taken together.]
Inner model : centroid scheme
Results
[Figure: estimated model for the global score (propensity to use cosmetic products) and the four partial scores ξ1 Body care, ξ2 Face care, ξ3 Make-up and ξ4 Sun care, annotated with correlations and regression coefficients for the 20 manifest variables.]
Manifest variables by block:
  Body care: soap body, liquid soap, body cream, hand cream
  Face care: make-up remover, tonic lotion, eye make-up remover, day cream, night cream, exfoliation product
  Make-up: blusher, mascara, eye shadow, eye pencil, lipstick, nail polish
  Sun care: sun protection face, after sun face, sun protection body, after sun body
81
Result: Global score
Global score = -3.40
- .11 * soaps and toilet soaps for body care
+ .20 * liquid soaps for body care
+ .38 * moisturising body creams and milks
+ .25 * hand creams and milks
+ .21 * make-up removers
+ .26 * tonic lotions
+ .30 * eye make-up removers
+ .39 * moisturising day creams
+ .30 * moisturising night creams
+ .30 * exfoliation products
+ .26 * blushers
+ .41 * mascara
+ .26 * eye shadows
+ .20 * eye pencils
+ .33 * lipsticks and lip shiners
+ .20 * nail polish
+ .36 * sun protection products for the face
+ .31 * moisturising after sun products for the face
+ .38 * sun protection products for the body
+ .34 * moisturising after sun products for the body
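As a small illustration, the global score formula can be evaluated as follows. The intercept and coefficients are those printed on the slide; the way product use is coded (the values passed in `usage`) is not given here, so the inputs below are purely hypothetical, as is the function name `global_score`.

```python
INTERCEPT = -3.40

# Coefficients as printed on the slide, keyed by product.
COEFFICIENTS = {
    "soaps and toilet soaps for body care": -0.11,
    "liquid soaps for body care": 0.20,
    "moisturising body creams and milks": 0.38,
    "hand creams and milks": 0.25,
    "make-up removers": 0.21,
    "tonic lotions": 0.26,
    "eye make-up removers": 0.30,
    "moisturising day creams": 0.39,
    "moisturising night creams": 0.30,
    "exfoliation products": 0.30,
    "blushers": 0.26,
    "mascara": 0.41,
    "eye shadows": 0.26,
    "eye pencils": 0.20,
    "lipsticks and lip shiners": 0.33,
    "nail polish": 0.20,
    "sun protection products for the face": 0.36,
    "moisturising after sun products for the face": 0.31,
    "sun protection products for the body": 0.38,
    "moisturising after sun products for the body": 0.34,
}

def global_score(usage):
    """Linear score; `usage` maps product -> usage value (hypothetical coding).

    Products missing from `usage` are treated as 0.
    """
    unknown = set(usage) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown products: {sorted(unknown)}")
    return INTERCEPT + sum(COEFFICIENTS[p] * v for p, v in usage.items())

# Hypothetical inputs: no product used, and every product coded 1.
score_none = global_score({})
score_all_ones = global_score({p: 1 for p in COEFFICIENTS})
print(round(score_none, 2), round(score_all_ones, 2))
```

Only soaps and toilet soaps for body care carry a negative coefficient, so under any non-negative coding every other product raises the score.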
82
Results: partial scores
[Figure with four panels, one per partial score: Score body-care, Score facial-care, Score make-up, Score sun-care]
83
Result: global score
                 S_body-care   S_facial-care   S_make-up   S_sun-care
S_facial-care    0.24001
S_make-up        0.13462       0.35035
S_sun-care       0.16500       0.19075         0.14273
S_global         0.50263       0.71846         0.67347     0.62071
84
Relevance of the global score
Factors influencing the use of cosmetic products
To identify behavioural and skin characteristic variables which best account for the variation in the use of cosmetic products, we can relate the global score to the following variables:
• Professional activity & socio-professional category
• Children
• Sun exposure habits
• Practice of sport
• Importance of physical appearance
• Type of facial skin & type of body skin
• Age
85
Relevance of the global score
E(Global score) = -1.02
+ .21 * professional activity
+ .07 * housewife or student
+ .00 * retired
+ .27 * CSP A (craftsmen, trades people, business managers, managerial staff, academics and professionals)
+ .09 * CSP B (farmers and intermediary professions)
+ .05 * CSP C (employees and working class people)
+ .00 * CSP D (retired and non working people)
- .21 * without child
+ .00 * with child
+ .40 * habits of deliberate exposure to sunlight
+ .09 * previous habits of deliberate exposure to sunlight
+ .00 * no habits of deliberate exposure to sunlight
- .17 * no sport practised
+ .00 * sport practised
+ 1.04 * physical appearance is of extreme importance
+ .89 * physical appearance is of high importance
+ .50 * physical appearance is of some importance
+ .00 * physical appearance is of little importance
- .06 * oily facial skin
+ .16 * combination facial skin
- .20 * normal facial skin
+ .00 * dry facial skin
- .32 * oily body skin
- .57 * combination body skin
- .32 * normal body skin
+ .00 * dry body skin
- .00 * age
86
A good profile
Selected terms:
-1.02
+ .21 * professional activity
+ .27 * CSP A (craftsmen, trades people, business managers, managerial staff, academics and professionals)
+ .00 * with child
+ .40 * habits of deliberate exposure to sunlight
+ .00 * sport practised
+ 1.04 * physical appearance is of extreme importance
+ .16 * combination facial skin
+ .00 * dry body skin
E(Global score) = 1.06
87
A bad profile
Selected terms:
-1.02
+ .00 * retired
+ .00 * CSP D (retired and non working people)
- .21 * without child
+ .00 * no habits of deliberate exposure to sunlight
- .17 * no sport practised
+ .00 * physical appearance is of little importance
- .20 * normal facial skin
- .57 * combination body skin
E(Global score) = -2.00
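The two profiles can be checked by summing the relevant terms of the regression equation. The sketch below is an illustration only: the factor and level names are hypothetical shorthand for the categories on the slides, and the age term is left out on the assumption that its printed coefficient (-.00) contributes nothing at two decimals.

```python
INTERCEPT = -1.02

# Effects as printed on the slides; one level is chosen per factor.
EFFECTS = {
    "occupation": {"professional activity": 0.21,
                   "housewife or student": 0.07, "retired": 0.00},
    "csp": {"A": 0.27, "B": 0.09, "C": 0.05, "D": 0.00},
    "children": {"without child": -0.21, "with child": 0.00},
    "sun exposure": {"current": 0.40, "previous": 0.09, "none": 0.00},
    "sport": {"not practised": -0.17, "practised": 0.00},
    "appearance": {"extreme": 1.04, "high": 0.89, "some": 0.50, "little": 0.00},
    "facial skin": {"oily": -0.06, "combination": 0.16,
                    "normal": -0.20, "dry": 0.00},
    "body skin": {"oily": -0.32, "combination": -0.57,
                  "normal": -0.32, "dry": 0.00},
}

def expected_score(profile):
    """Intercept plus the effect of the chosen level of every factor."""
    return INTERCEPT + sum(EFFECTS[f][profile[f]] for f in EFFECTS)

good = {"occupation": "professional activity", "csp": "A",
        "children": "with child", "sun exposure": "current",
        "sport": "practised", "appearance": "extreme",
        "facial skin": "combination", "body skin": "dry"}

bad = {"occupation": "retired", "csp": "D",
       "children": "without child", "sun exposure": "none",
       "sport": "not practised", "appearance": "little",
       "facial skin": "normal", "body skin": "combination"}

good_score = expected_score(good)
bad_score = expected_score(bad)
print(round(good_score, 2), round(bad_score, 2))
```

With the rounded coefficients, the first profile evaluates to 1.06, the value shown for the good profile. The second evaluates to -2.17 rather than the -2.00 shown for the bad profile; -2.00 is what the sum gives when the "no sport practised" term (-.17) is left out, so the figure on the slide may not include that term.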
88
Conclusion
Using the PLS approach, we obtain a score representing the propensity to use cosmetic products that balances the different types of cosmetic products better than principal component analysis does.
89
Final conclusion
« All the proofs of a pudding are in the eating, not in the cooking ».
William Camden (1623)
90
Some references on PLS Path Modeling
• CHIN W.W. (2001) : PLS-Graph User’s Guide, C.T. Bauer College of Business, University of Houston, Houston.
• CHIN W.W. (1998) : “The partial least squares approach for structural equation modeling”, in: G.A. Marcoulides (Ed.) Modern Methods for Business Research, Lawrence Erlbaum Associates, pp. 295-336.
• FORNELL C. (1992) : “A National Customer Satisfaction Barometer: The Swedish Experience”, Journal of Marketing, Vol. 56, 6-21.
• FORNELL C. & CHA J. (1994) : “Partial Least Squares”, in Advanced Methods of Marketing Research, R.P. Bagozzi (Ed.), Basil Blackwell, Cambridge, MA., pp. 52-78.
• GUINOT C., LATREILLE J. & TENENHAUS M. (2001) : “PLS Path Modeling and Analysis of Multiple Tables”, Chemometrics and Intelligent Laboratory Systems, Special issue on PLS methods, 58.
• LOHMÖLLER J.-B. (1987) : LVPLS Program Manual, Version 1.8, Zentralarchiv für Empirische Sozialforschung, Köln.
91
• LOHMÖLLER J.-B. (1989) : Latent Variables Path Modeling with Partial Least Squares, Physica-Verlag, Heidelberg.
• PAGÈS J. & TENENHAUS M. (2001) : “Multiple Factor Analysis and PLS Path Modeling”, Chemometrics and Intelligent Laboratory Systems, 58, 261-273.
• TENENHAUS M. (1998) : La Régression PLS, Éditions Technip, Paris.
• TENENHAUS M. (1999) : “L’approche PLS”, Revue de Statistique Appliquée, vol. 47, n°2, pp. 5-40.
• TENENHAUS M., ESPOSITO VINZI V., CHATELIN Y.-M. & LAURO C. (2005) : “PLS Path Modeling”, Computational Statistics and Data Analysis.
• WOLD H. (1985) : “Partial Least Squares”, in Encyclopedia of Statistical Sciences, vol. 6, Kotz S. & Johnson N.L. (Eds), John Wiley & Sons, New York, pp. 581-591.