1

PLS Path Modeling

Michel Tenenhaus ([email protected])


PLS Methods

Initiated by Herman Wold, Svante Wold, Harald Martens and Jan-Bernd Lohmöller

1. NIPALS (Nonlinear Iterative Partial Least Squares)

2. PLS Regression (Partial Least Squares Regression)

3. PLS Discriminant Analysis

4. SIMCA (Soft Independent Modeling by Class Analogy)

5. PLS Approach to Structural Equation Modeling

6. N-way PLS

7. PLS Logistic Regression

8. PLS Generalized Linear Model

8

PLS Methods

PLS Path Modeling:

PLS Approach to Structural Equation Modeling

9

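ECSI path model for a “mobile phone provider”

[Figure: path diagram linking the latent variables Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint and Loyalty. Each arrow carries a path coefficient with its p-value in parentheses, and each endogenous latent variable carries its R2.]

Path coefficients (p-values) shown on the diagram: .493 (.000), .545 (.000), .066 (.314), .037 (.406), .153 (.006), .212 (.002), .540 (.000), .544 (.000), .200 (.000), .466 (.000), .540 (.000), .05 (.399)

R2 values shown on the diagram: .243, .297, .335, .672, .432, .292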

10

Structural Equation Modeling: the PLS approach of Herman Wold

• Study of a system of linear relationships between latent variables.

• Each latent variable is described by a set of manifest variables, or summarizes them.

• Variables can be numerical, ordinal or nominal (no need for normality assumptions).

• The number of observations can be small compared to the number of variables.

11

Economic inequality and political instability
Data from Russett (1964), in GIFI

Economic inequality

Agricultural inequality
  GINI : Inequality of land distributions
  FARM : % farmers that own half of the land (> 50)
  RENT : % farmers that rent all their land

Industrial development
  GNPR : Gross national product per capita ($ 1955)
  LABO : % of labor force employed in agriculture

Political instability
  INST : Instability of executive (45-61)
  ECKS : Nb of violent internal war incidents (46-61)
  DEAT : Nb of people killed as a result of civic group violence (50-62)
  D-STAB : Stable democracy
  D-UNST : Unstable democracy
  DICT : Dictatorship

12

Economic inequality and political instability (Data from Russett, 1964)

             Gini   Farm   Rent   Gnpr   Labo   Inst   Ecks   Deat   Demo
Argentina    86.3   98.2   32.9    374     25   13.6     57    217      2
Australia    92.9   99.6      *   1215     14   11.3      0      0      1
Austria      74.0   97.4   10.7    532     32   12.8      4      0      2
France       58.3   86.1   26.0   1046     26   16.3     46      1      2
Yugoslavia   43.7   79.8    0.0    297     67    0.0      9      0      3

Demo: 1 = Stable democracy, 2 = Unstable democracy, 3 = Dictatorship

47 countries

13

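Economic inequality and political instability

[Figure: path diagram. Block X1, Agricultural inequality (ξ1), has manifest variables GINI, FARM and RENT; block X2, Industrial development (ξ2), has GNPR and LABO; block X3, Political instability (ξ3), has INST, ECKS, DEAT, D-STB, D-INS and DICT. Arrows go from ξ1 and ξ2 to ξ3, and each arrow is annotated with its expected sign (+ or -).]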

14

Modeling

• Reflective model (the block is supposed to be unidimensional)

Each manifest variable Xjh is written as : $X_{jh} = \lambda_{jh}\,\xi_h + \varepsilon_{jh}$

• Formative model (the block can be multidimensional)

The latent variable ξh is a function of the manifest variables of its block Xh : $\xi_h = \sum_j \omega_{hj} X_{hj}$

• There exists a linear structural relationship between the latent variables:

Political instability (ξ3) = β1 Agri. inequality (ξ1) + β2 Ind. development (ξ2) + residual

15

Estimation of latent variables using the PLS approach

(1) External (outer) estimation Yh of ξh

Yh = Xhwh

(2) Internal (inner) estimation Zh of ξh

$Z_h = \sum_{j:\ \xi_j \text{ related with } \xi_h} \operatorname{sign}[\operatorname{cor}(\xi_h, \xi_j)]\, Y_j$

(3) Calculation of wh

whj = cor(Zh , Xhj)

16


(1) External estimation Yh of ξh : Yh = Xhwh

Y1 = w11Gini + w12Farm + w13Rent

Y2 = X2w2

Y3 = X3w3

17


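(2) Internal estimation Zh of ξh (Centroid scheme)

$Z_h = \sum_{j:\ \xi_j \text{ related with } \xi_h} \operatorname{sign}[\operatorname{cor}(\xi_h, \xi_j)]\, Y_j$

Z1 = sign(cor(ξ1, ξ3)) Y3 = (+1) Y3

Z2 = sign(cor(ξ2, ξ3)) Y3 = (-1) Y3

Z3 = sign(cor(ξ3, ξ1)) Y1 + sign(cor(ξ3, ξ2)) Y2 = (+1) Y1 + (-1) Y2

[Figure: path diagram with the arrow from ξ1 to ξ3 marked + and the arrow from ξ2 to ξ3 marked -.]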

18

(3) Calculation of the weights wh

whj = cor(Xhj , Zh)

w11 = cor(Gini , Z1)

w12 = cor(Farm , Z1)

w13 = cor(Rent , Z1)

And the same way for the other whj.


19

Weight initialization in PLS-Graph

Option “1” : All weights are equal to 1.

Option “–1” : All weights are equal to 1, except the last one, which is set to –1.

w11,initial = 1

w12,initial = 1

w13,initial = -1

This choice allows some sign control:

If the variable with the largest weight is placed in the last position, this weight has a good chance of being negative.

20

Economic inequality and political instability
Estimation of latent variables with the PLS approach

(1) External estimation

Y1 = X1w1

Y2 = X2w2

Y3 = X3w3

(2) Internal estimation

Z1 = Y3

Z2 = -Y3

Z3 = Y1 - Y2

(3) Calculation of wh

w1j = cor(X1j , Z1)

w2j = cor(X2j , Z2)

w3j = cor(X3j , Z3)

Algorithm

• Begin with arbitrary weights w1, w2, w3.

• Get new weights wh by using (1) to (3).

• Iterate until convergence.
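The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the PLS-Graph implementation: it uses Mode A with the centroid scheme, starts from weights equal to 1, and runs on synthetic data (the Russett table is not reproduced here). The names `standardize`, `cor`, `outer`, `pls_path`, `make_block`, `blocks` and `links` are hypothetical.

```python
import math
import random

def standardize(col):
    """Center a column and scale it to unit variance."""
    n = len(col)
    m = sum(col) / n
    s = math.sqrt(sum((v - m) ** 2 for v in col) / n)
    return [(v - m) / s for v in col]

def cor(a, b):
    # Both inputs are standardized, so the correlation is the mean product.
    return sum(x * y for x, y in zip(a, b)) / len(a)

def outer(block, w):
    """Step (1): Y_h = X_h w_h, standardized."""
    n = len(block[0])
    y = [sum(w[j] * block[j][i] for j in range(len(block))) for i in range(n)]
    return standardize(y)

def pls_path(blocks, links, max_iter=200, tol=1e-8):
    """Mode A, centroid scheme. blocks[h] is a list of standardized columns,
    links[h] lists the blocks connected to block h."""
    weights = [[1.0] * len(b) for b in blocks]  # arbitrary starting weights
    for _ in range(max_iter):
        Y = [outer(b, w) for b, w in zip(blocks, weights)]
        new = []
        for h, b in enumerate(blocks):
            z = [0.0] * len(Y[h])
            for k in links[h]:  # step (2): signed sum of connected scores
                s = 1.0 if cor(Y[h], Y[k]) >= 0 else -1.0
                z = [zi + s * yk for zi, yk in zip(z, Y[k])]
            z = standardize(z)
            new.append([cor(x, z) for x in b])  # step (3): w_hj = cor(X_hj, Z_h)
        delta = max(abs(a - c) for wa, wc in zip(weights, new)
                    for a, c in zip(wa, wc))
        weights = new
        if delta < tol:
            break
    Y = [outer(b, w) for b, w in zip(blocks, weights)]
    return weights, Y

# Synthetic example (assumption): block 3 depends positively on block 1
# and negatively on block 2, as in the path diagram.
rng = random.Random(0)
n = 300
l1 = [rng.gauss(0, 1) for _ in range(n)]
l2 = [rng.gauss(0, 1) for _ in range(n)]
l3 = [0.5 * a - 0.6 * b + 0.4 * rng.gauss(0, 1) for a, b in zip(l1, l2)]

def make_block(latent, p):
    return [standardize([v + 0.5 * rng.gauss(0, 1) for v in latent])
            for _ in range(p)]

blocks = [make_block(l1, 3), make_block(l2, 2), make_block(l3, 3)]
links = {0: [2], 1: [2], 2: [0, 1]}
weights, scores = pls_path(blocks, links)
```

The weights are not rescaled inside the loop because each Yh is standardized after the outer step, so only their relative sizes matter.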

21

Use of PLS-Graph (Wynne Chin)

22

Results

Loading = regression coefficient of Xhj on Yh ,

= cor(Xhj, Yh) if the X variables are standardized (a correlation)

Outer Model
===============================
Variable      Weight   Loading
-------------------------------
Ineg_agri  outward
  gini        0.4567    0.9745
  farm        0.5125    0.9857
  rent        0.1018    0.5156
-------------------------------
Dev_ind  outward
  gnpr        0.5113    0.9501
  labo       -0.5384   -0.9551
-------------------------------
Inst_pol  outward
  inst        0.1187    0.3676
  ecks        0.2855    0.8241
  death       0.2977    0.7910
  demostab   -0.3271   -0.8635
  demoinst    0.0370    0.1037
  dictatur    0.2758    0.7227
===============================

23

Results

Eta .. Latent variables
========================================
        ineg_agr   dev_indu   inst_pol
----------------------------------------
arg        .964       .238       .755
aus       1.204      1.371     -1.617
aut        .397       .253      -.480
bel       -.812      1.530      -.846
bol       1.115     -1.584      1.505
bré        .778      -.654       .302
...
tai       -.009      -.898      -.068
ru         .134      2.059     -1.046
eu         .193      2.016      -.942
uru        .699       .179     -1.298
ven       1.149       .252      1.135
rfa       -.212      1.104      -.494
you      -2.189      -.654       .125
========================================

24

Latent variable estimation

               Y1      Y2      Y3
Argentina     0.96    0.24    0.75
Australia     1.20    1.37   -1.62
Austria       0.39    0.25   -0.48
France       -0.88    0.80    0.56
Yugoslavia   -2.19   -0.65    0.13

PLS results: multiple regression of Y3 on Y1 and Y2

Political instability = 0.217 Agricultural inequality – 0.692 Industrial development
                        (2.24)                          (-7.22)

R2 = 0.618

The Student t values in parentheses come from the multiple regression results.

25

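Economic inequality and political instability

[Figure: path diagram with the estimated loadings and path coefficients.]

Loadings, Agricultural inequality (X1): GINI .974, FARM .986, RENT .516

Loadings, Industrial development (X2): GNPR .950, LABO -.955

Loadings, Political instability (X3): INST .368, ECKS .824, DEAT .791, D-STB -.864, D-UNS .104, DICT .723

Path coefficients: Agricultural inequality to Political instability .217, Industrial development to Political instability -.692, R2 = 0.618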

26

Map of countries : Y1 = agricultural inequality , Y2 = industrial development

[Figure: scatter plot of the 47 countries, with Y1 (agricultural inequality) on the horizontal axis and Y2 (industrial development) on the vertical axis. Each country is labelled with its regime: (1) stable democracy, (2) unstable democracy, (3) dictatorship. Most stable democracies lie in the upper half (Y2 > 0) and most dictatorships in the lower half (Y2 < 0).]

27

Results

Inner Model
=======================
Block       Mult.RSq
-----------------------
Inégalit      0.0000
Développ      0.0000
Instabil      0.6180
=======================

Path coefficients
========================================
            Ineg_agr   Dev_indu   Inst_pol
----------------------------------------
Ineg_agr      0.000      0.000      0.000
Dev_indu      0.000      0.000      0.000
Inst_pol      0.217     -0.692      0.000
========================================

Correlations of latent variables
========================================
            Ineg_agr   Dev_indu   Inst_pol
----------------------------------------
Ineg_agr      1.000
Dev_indu     -0.309      1.000
Inst_pol      0.431     -0.759      1.000
========================================
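As a check, the path coefficients and the R2 of Inst_pol can be recovered from the three correlations of the latent variables, because a regression on two standardized predictors has a closed form. A minimal sketch; the variable names are illustrative only.

```python
# Correlations printed in the table "Correlations of latent variables".
r12 = -0.309  # Ineg_agr, Dev_indu
r13 = 0.431   # Ineg_agr, Inst_pol
r23 = -0.759  # Dev_indu, Inst_pol

# Standardized regression of Inst_pol on Ineg_agr and Dev_indu.
det = 1.0 - r12 ** 2
b1 = (r13 - r12 * r23) / det   # coefficient of Ineg_agr
b2 = (r23 - r12 * r13) / det   # coefficient of Dev_indu
r2 = b1 * r13 + b2 * r23       # multiple R squared

print(round(b1, 3), round(b2, 3), round(r2, 3))
```

The two coefficients come out at about 0.217 and -0.692. The R2 agrees with the printed 0.618 only up to the three-decimal rounding of the correlations.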

28

Results

Outer Model
========================================================
Variable      Weight   Loading   Communality   Redundancy
--------------------------------------------------------
Ineg_agri  outward
  gini        0.4567    0.9745     0.9496        0.0000
  farm        0.5125    0.9857     0.9716        0.0000
  rent        0.1018    0.5156     0.2659        0.0000
--------------------------------------------------------
Dev_indu  outward
  gnpr        0.5113    0.9501     0.9027        0.0000
  labo       -0.5384   -0.9551     0.9123        0.0000
--------------------------------------------------------
Inst_pol  outward
  inst        0.1187    0.3676     0.1352        0.0835
  ecks        0.2855    0.8241     0.6792        0.4197
  death       0.2977    0.7910     0.6257        0.3867
  demostab   -0.3271   -0.8635     0.7457        0.4608
  demoinst    0.0370    0.1037     0.0107        0.0066
  dictatur    0.2758    0.7227     0.5223        0.3228
========================================================

Communality = Cor2(Xhj, Yh) = Loading2

For endogenous LV : Redundancy = Cor2(Xhj, Yh) * R2(Yh, LVs explaining Yh)

Average redundancy for Inst_pol = 0.28
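The two definitions can be applied directly to the Inst_pol loadings from the table. A small sketch, with hypothetical variable names; the results match the printed columns up to the rounding of the four-decimal loadings.

```python
# Loadings of the Inst_pol block and its R2, copied from the tables.
loadings = {
    "inst": 0.3676, "ecks": 0.8241, "death": 0.7910,
    "demostab": -0.8635, "demoinst": 0.1037, "dictatur": 0.7227,
}
r2_inst_pol = 0.6180

# Communality = squared loading; redundancy = communality * R2 of the block's LV.
communality = {name: value ** 2 for name, value in loadings.items()}
redundancy = {name: c * r2_inst_pol for name, c in communality.items()}

avg_communality = sum(communality.values()) / len(communality)
avg_redundancy = sum(redundancy.values()) / len(redundancy)

print(round(avg_communality, 4), round(avg_redundancy, 4))
```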

29

Results

Inner Model
==============================================================
Block       Mult.RSq   AvCommun   AvRedund   Goodness of Fit
--------------------------------------------------------------
Ineg_agri     0.0000     0.7290     0.0000
Dev_indu      0.0000     0.9075     0.0000
Inst_pol      0.6180     0.4531     0.2800
--------------------------------------------------------------
Average       0.6180     0.6110     0.2800        .614
==============================================================

AvCommun of block h:

$\text{Average communality}_h = \frac{1}{p_h} \sum_{j=1}^{p_h} \operatorname{Cor}^2(X_{hj}, Y_h)$

= Average Variance of Xh explained by Yh

= AVEh

A latent variable must explain at least 50% of its block variance.

Average Communality = (3*AvCommun1 + 2*AvCommun2 + 6*AvCommun3)/11

The average Mult.RSq (.618) gives the value of the internal model and the average communality (.611) gives the value of the external model:

$\text{GoF} = \sqrt{.618 \times .611} = .614$

30

A global index of model fit: PLS Goodness of Fit

Inner Model
===================================================
Block       Mult.RSq   AvCommun   AvRedund   GoF
---------------------------------------------------
Ineg_agri     0.0000     0.7290     0.0000
Dev_indu      0.0000     0.9075     0.0000
Inst_pol      0.6180     0.4531     0.2800
---------------------------------------------------
Average       0.618      0.6110     0.2800   .614
===================================================

$\text{GoF} = \sqrt{\left[\frac{1}{\text{Nb of endogenous LVs}} \sum_h R^2(Y_h, \text{other LVs explaining } Y_h)\right] \times \left[\frac{1}{\text{Nb of MVs}} \sum_h \sum_{j=1}^{p_h} \operatorname{Cor}^2(X_{hj}, Y_h)\right]}$

The first factor measures the inner model, the second the outer model.
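The GoF value in the table can be reproduced from the block sizes, the average communalities and the single endogenous R2. A minimal sketch with illustrative variable names:

```python
import math

# (number of manifest variables, average communality) per block, from the table.
blocks = {
    "Ineg_agri": (3, 0.7290),
    "Dev_indu": (2, 0.9075),
    "Inst_pol": (6, 0.4531),
}
r2_endogenous = [0.6180]  # Inst_pol is the only endogenous latent variable

# Outer model: communality averaged over all 11 manifest variables.
n_mv = sum(p for p, _ in blocks.values())
avg_communality = sum(p * c for p, c in blocks.values()) / n_mv

# Inner model: R2 averaged over the endogenous latent variables.
avg_r2 = sum(r2_endogenous) / len(r2_endogenous)

gof = math.sqrt(avg_r2 * avg_communality)
print(round(avg_communality, 4), round(gof, 3))
```

Weighting each block by its number of manifest variables is what makes the overall average 0.6110 rather than the plain mean of the three block averages.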

31

Discriminant validity

An LV should share more variance with its own MVs than with the other LVs.

AVE (diagonal) and squared correlations (off-diagonal)
========================================
            Ineg_agr   Dev_indu   Inst_pol
----------------------------------------
Ineg_agr      0.729
Dev_indu      0.095      0.907
Inst_pol      0.186      0.576      0.453
========================================

AVE(Yj) must be larger than cor2(Yj,Yh) for all h ≠ j.

???? Inst_pol does not satisfy this: its AVE (0.453) is smaller than its squared correlation with Dev_indu (0.576).
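The comparison of each AVE with the squared correlations can be automated. A short sketch using the AVE values above and the latent variable correlations reported in the results; the names `ave`, `correlations` and `violations` are hypothetical.

```python
# AVE per latent variable and correlations between latent variables.
ave = {"Ineg_agr": 0.729, "Dev_indu": 0.907, "Inst_pol": 0.453}
correlations = {
    ("Ineg_agr", "Dev_indu"): -0.309,
    ("Ineg_agr", "Inst_pol"): 0.431,
    ("Dev_indu", "Inst_pol"): -0.759,
}

# An LV fails the criterion when its AVE is not larger than a squared correlation.
violations = []
for (a, b), r in correlations.items():
    squared = round(r * r, 3)
    for lv, other in ((a, b), (b, a)):
        if ave[lv] <= squared:
            violations.append((lv, other, squared))

print(violations)
```

Only the pair (Inst_pol, Dev_indu) is flagged, which is the cell questioned on the slide.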

32

Using PLS-Graph

(t=1.705)

(t=-7.685)

t coming from bootstrap re-sampling

33

Bootstrap validation in PLS-Graph
Sign control: Individual sign changes / Construct level changes (*)

Outer Model Loadings:
====================================================================
               Entire      Mean of      Standard    T-Statistic
               sample      subsamples   error
               estimate
Agricultural inequality:
  gini         0.9745      0.9584       0.0336       28.9616
  farm         0.9857      0.9689       0.0329       29.9339
  rent         0.5156      0.4204       0.2462        2.0946

Industrial development:
  gnpr         0.9501      0.9489       0.0121       78.3692
  labo        -0.9551     -0.9536       0.0107      -89.1493

Political instability:
  inst         0.3676      0.3347       0.1756        2.0932
  ecks         0.8241      0.8138       0.0699       11.7920
  demostab    -0.8635     -0.8520       0.0667      -12.9419
  demoinst     0.1037      0.0955       0.1611        0.6438
  dictatur     0.7227      0.7195       0.0841        8.5915
  death        0.7910      0.7977       0.0528       14.9773
====================================================================

(*) used here

34

Bootstrap validation in PLS-Graph: Sign control

Individual sign changes

The sign of each bootstrapped weight is automatically set equal to the sign of the corresponding full-sample weight.

Construct level changes (Default)

For each LV (construct), all the weights are reversed together if the new loadings (after reversal) are closer to the full-sample loadings than the bootstrapped loadings (before reversal).

35

PLS-Graph : Bootstrap Validation

Path Coefficients Table (Entire Sample Estimate):
====================================================================
                 Agric. Ineq.   Indust. Dev.   Pol. Instab.
Agric. Ineq.        0.0000         0.0000         0.0000
Indust. Dev.        0.0000         0.0000         0.0000
Pol. Instab.        0.2170        -0.6920         0.0000
====================================================================

Path Coefficients Table (Mean of Subsamples):
====================================================================
                 Agric. Ineq.   Indust. Dev.   Pol. Instab.
Agric. Ineq.        0.0000         0.0000         0.0000
Indust. Dev.        0.0000         0.0000         0.0000
Pol. Instab.        0.2328        -0.6743         0.0000
====================================================================

Path Coefficients Table (Standard Error):
====================================================================
                 Agric. Ineq.   Indust. Dev.   Pol. Instab.
Agric. Ineq.        0.0000         0.0000         0.0000
Indust. Dev.        0.0000         0.0000         0.0000
Pol. Instab.        0.1272         0.0900         0.0000
====================================================================

Path Coefficients Table (T-Statistic)
====================================================================
                 Agric. Ineq.   Indust. Dev.   Pol. Instab.
Agric. Ineq.        0.0000         0.0000         0.0000
Indust. Dev.        0.0000         0.0000         0.0000
Pol. Instab.        1.7054        -7.6855         0.0000
====================================================================

36

SPECIAL CASES OF PLS PATH MODELLING

• Principal component analysis

• Multiple factor analysis

• Canonical correlation analysis

• Redundancy analysis

• PLS Regression

• Generalized canonical correlation analysis (Horst)

• Generalized canonical correlation analysis (Carroll)

37

Options of the PLS algorithm

External estimation

Yj = Xjwj

Mode A (for reflective) :

wjh = cor(Xjh , Zj)

Mode B (for formative) :

$w_j = (X_j' X_j)^{-1} X_j' Z_j$

Internal estimation

$Z_j = \sum_i e_{ji} Y_i$

Centroid scheme

eji = sign of cor(Yi,Yj)

Factorial scheme

eji = cor(Yi,Yj)

Path weighting scheme

eji = regression coeff. in the regression of Yj on the Yi’s
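The centroid and factorial schemes differ only in how the inner weights eji are read off the correlations between connected latent variables. A small sketch, using the two correlations of the Russett example; the function `inner_weights` and its arguments are hypothetical, and the path weighting scheme is left out because it needs a regression step.

```python
def inner_weights(correlations, links, scheme):
    """Return e[(j, i)] for every latent variable i connected to j.

    correlations maps frozenset({i, j}) to cor(Yi, Yj);
    links maps j to the list of latent variables connected to it.
    """
    e = {}
    for j, neighbours in links.items():
        for i in neighbours:
            r = correlations[frozenset((i, j))]
            if scheme == "centroid":
                e[(j, i)] = 1.0 if r >= 0 else -1.0  # sign of cor(Yi, Yj)
            elif scheme == "factorial":
                e[(j, i)] = r                        # cor(Yi, Yj) itself
            else:
                raise ValueError("unknown scheme: " + scheme)
    return e

# Russett example: latent variables 1 and 2 are each linked to 3.
correlations = {frozenset((1, 3)): 0.431, frozenset((2, 3)): -0.759}
links = {1: [3], 2: [3], 3: [1, 2]}

centroid = inner_weights(correlations, links, "centroid")
factorial = inner_weights(correlations, links, "factorial")
```

With the centroid scheme this gives Z3 = Y1 - Y2, as in the worked example; the factorial scheme gives Z3 = 0.431 Y1 - 0.759 Y2.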

38

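The general PLS algorithm

[Figure: flow chart of the iteration, from the initial weights to the outer estimation, the inner estimation and the updated weights.]

Initial step: starting weights wj

Outer estimation (standardized): Yj = Xjwj

Inner estimation: Zj is computed from the latent variables Yj1, Yj2, ..., Yjm connected to Yj, with weights ej1, ej2, ..., ejm

Choice of weights e: Centroid, Factorial or Path weighting scheme

Mode A: $w_j = \frac{1}{n} X_j' Z_j$

Mode B: $w_j = \left(\frac{1}{n} X_j' X_j\right)^{-1} \left(\frac{1}{n} X_j' Z_j\right)$

w ∝ cor(X,Z)

Look at the loading, not at the w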

Some modified multi-block methods for SEM

cjk = 1 if blocks are linked, 0 otherwise

SUMCOR (Horst, 1961) :

$\max \sum_{j,k} c_{jk} \operatorname{Cor}(F_j, F_k)$

Mathes (1993), Hanafi (2004) :

$\max \sum_{j,k} c_{jk} \operatorname{Cor}^2(F_j, F_k)$

Mathes (1993), Hanafi (2004) :

$\max \sum_{j,k} c_{jk} |\operatorname{Cor}(F_j, F_k)|$

MAXBET (Van de Geer, 1984) :

$\max_{\text{All } \|w_j\| = 1} \sum_j \left[ \operatorname{Var}(X_j w_j) + \sum_{k \neq j} c_{jk} \operatorname{Cov}(X_j w_j, X_k w_k) \right]$

MAXDIFF (Van de Geer, 1984) :

$\max_{\text{All } \|w_j\| = 1} \sum_{j \neq k} c_{jk} \operatorname{Cov}(X_j w_j, X_k w_k)$

MAXDIFF B (Hanafi & Kiers, 2006) :

$\max_{\text{All } \|w_i\| = 1} \sum_{i \neq j} c_{ij} \operatorname{Cov}^2(X_i w_i, X_j w_j)$

PLS counterparts marked next to the criteria: PLS : B, Centroid; PLS : B, Factorial; PLS : B, Horst (New); PLS : A, Horst (NEW APPROACH); PLS : A, Factorial (NEW APPROACH)

40

PLS approach : 2 blocks

[Figure: two linked blocks X1 and X2.]

Mode for weight calculation
Y1 = X1w1   Y2 = X2w2   Method                                          Deflation (*)
A           A           PLS regression of X2 on X1                      On X1 only
B           A           Redundancy analysis of X2 with respect to X1    On X1 only
A           A           Tucker Inter-Battery Factor Analysis            On X1 and X2
B           B           Canonical correlation Analysis                  On X1 and X2

(*) Deflation: Working on residuals of the regression of X on the previous LV’s in order to obtain orthogonal LV’s.

41

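PLS regression (2 components)

- Mode A for X

- Mode A for Y

- Deflate only X

$\max_{\|a\| = \|b\| = 1} \operatorname{Cov}(Xa, Yb) = \max_{\|a\| = \|b\| = 1} \operatorname{Cor}(Xa, Yb) \sqrt{\operatorname{Var}(Xa)} \sqrt{\operatorname{Var}(Yb)}$

[Figure: path diagrams for dim 1 and dim 2.]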

42

PLS Regression in SIMCA-P : PLS Scores

[Figure: scatter plot of the countries on the first two PLS components, t[1] (horizontal axis) and t[2] (vertical axis).]

43

Correlation loadings

[Figure: correlation loading plot, pc(corr)[Comp. 1] on the horizontal axis and pc(corr)[Comp. 2] on the vertical axis, both from -1.00 to 1.00, showing GINI, FARM, RENT, GNPR, LABO, INST, ECKS, DEAT, DEMOSTB, DEMOINST and DICTATURE.]

44

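Redundancy analysis of X on Y (2 components)

- Mode A for X

- Mode B for Y

- Deflate only X

$\max_{\|a\| = \operatorname{Var}(Yb) = 1} \operatorname{Cov}(Xa, Yb) = \max_{\|a\| = \operatorname{Var}(Yb) = 1} \operatorname{Cor}(Xa, Yb) \sqrt{\operatorname{Var}(Xa)}$

Equivalently, for standardized variables xj :

$\max_{\operatorname{Var}(Yb) = 1} \sum_j \operatorname{Cor}^2(x_j, Yb)$

[Figure: path diagrams for dim 1 and dim 2.]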

45

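Inter-battery factor analysis (2 components)

- Mode A for X

- Mode A for Y

- Deflate both X and Y

$\max_{\|a\| = \|b\| = 1} \operatorname{Cov}(Xa, Yb) = \max_{\|a\| = \|b\| = 1} \operatorname{Cor}(Xa, Yb) \sqrt{\operatorname{Var}(Xa)} \sqrt{\operatorname{Var}(Yb)}$

[Figure: path diagrams for dim 1 and dim 2.]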

46

Canonical correlation analysis (2 components)

- Mode B for X

- Mode B for Y

- Deflate both X and Y

$\max_{\operatorname{Var}(Xa) = \operatorname{Var}(Yb) = 1} \operatorname{Cov}(Xa, Yb)$

[Figure: path diagrams for dim 1 and dim 2.]

47

PLS approach : K blocks

[Figure: the blocks X1, ..., XK are each linked to the super-block X, which gathers X1, ..., XK.]

Deflation: On the super-block only

Methods obtained according to the mode for weight calculation (A or B) and the scheme for internal estimation calculation (Centroid, Factorial, Structural):

Mode A :
- PCA of X
- Multiple Factor Analysis of the Xj’s
- ACOM (Chessel & Hanafi)

Mode B :
- Generalized Canonical Correlation Analysis (Horst)
- Generalized Canonical Correlation Analysis (Carroll)

Three cells of the table are marked “NEW !”.

48

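A new PLS algorithm
Arthur & Michel Tenenhaus

cij = 1 if blocks are linked, 0 otherwise

Maximize

Horst scheme : $\sum_{i \neq j} c_{ij} \operatorname{Cov}(X_i a_i, X_j a_j)$

Centroid scheme : $\sum_{i \neq j} c_{ij} |\operatorname{Cov}(X_i a_i, X_j a_j)|$

Factorial scheme : $\sum_{i \neq j} c_{ij} \operatorname{Cov}^2(X_i a_i, X_j a_j)$

subject to the following constraints :

$\tau_i \|a_i\|^2 + (1 - \tau_i) \operatorname{Var}(X_i a_i) = 1$

For $\tau_i = 1$ (Mode A) : $\|a_i\| = 1$

For $\tau_i = 0$ (Mode B) : $\operatorname{Var}(X_i a_i) = 1$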

49

CONCLUSION

• PLS IS TO COVARIANCE-BASED SEM AS PRINCIPAL COMPONENT ANALYSIS IS TO FACTOR ANALYSIS.

• WHEN INDIVIDUAL DATA ARE AVAILABLE, SIGNIFICANCE TESTS CAN BE CARRIED OUT WITH PLS BY CROSS VALIDATION METHODS.

50

Michel Tenenhaus ([email protected])

European Customer Satisfaction Index

PLS Path Modelling versus LISREL

51

The European Customer Satisfaction Index (ECSI)

• ECSI is an economic indicator that measures customer satisfaction.

• It is an adaptation of the Swedish Customer Satisfaction Barometer and the American Customer Satisfaction Index (ACSI) proposed by Claes Fornell.

• Fornell’s methodology is presented.

52

Path model describing causes and consequences of Customer Satisfaction

[Figure: path diagram linking Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaints and Loyalty.]

Full model in red and blue, Reduced model in red

53

Content of the presentation

• Use of Fornell’s methodology on the full ECSI model

• Use of Fornell’s methodology on the reduced model

• Use of SEM-ML on the reduced model (SEM-ML did not work on the full model)

• Comparison between PLS and SEM-ML results on the reduced model

54

Measurement Instrument for the Mobile Phone Industry : Examples of latent and manifest variables

Customer expectation

a) Expectations for the overall quality of “your mobile phone provider” at the moment you became customer of this provider.

b) Expectations for “your mobile phone provider” to provide products and services to meet your personal need.

c) How often did you expect that things could go wrong at “your mobile phone provider” ?

Customer satisfaction

a) Overall satisfaction

b) Fulfilment of expectations

c) How well do you think “your mobile phone provider” compares with your ideal mobile phone provider ?

55

Measurement Instrument for the Mobile Phone Industry : Examples of latent and manifest variables

Customer loyalty

a) If you would need to choose a new mobile phone provider how likely is it that you would choose “your provider” again ?

b) Let us now suppose that other mobile phone providers decide to lower fees and prices, but “your mobile phone provider” stays at the same level as today. At which level of difference (in %) would you choose another phone provider ?

c) If a friend or colleague asks you for advice, how likely is it that you would recommend “your mobile phone provider” ?

And so on for the other latent variables ...

56

I. Study of the complete model using Fornell’s approach

• Manifest variables V are transformed from a scale “1-10” to a scale “0-100” :

$x = \frac{V - 1}{9} \times 100$

• Each latent variable is estimated as a weighted average of its manifest variables.

• PLS Path modeling is used to estimate the weights with Mode A and Centroid scheme options.

• Path coefficients are computed by multiple regression on the estimated latent variables and t-statistics by cross-validation (bootstrap).

57

Use of PLS-Graph (Wynne Chin)

58

Results : The weights

Outer Model
======================
Variable       Weight
----------------------
Image  outward
  IMAG1        0.0147
  IMAG2        0.0127
  IMAG3        0.0137
  IMAG4        0.0177
  IMAG5        0.0143
----------------------
Expectat  outward
  CUEX1        0.0232
  CUEX2        0.0224
  CUEX3        0.0252
----------------------
Per_Qual  outward
  PERQ1        0.0098
  PERQ2        0.0085
  PERQ3        0.0118
  PERQ4        0.0094
  PERQ5        0.0084
  PERQ6        0.0095
  PERQ7        0.0129
----------------------
Per_Valu  outward
  PERV1        0.0239
  PERV2        0.0247
----------------------
Satisfac  outward
  CUSA1        0.0158
  CUSA2        0.0231
  CUSA3        0.0264
----------------------
Complain  outward
  CUSCO        0.0397
----------------------
Loyalty  outward
  CUSL1        0.0185
  CUSL2        0.0061
  CUSL3        0.0225
======================

59

Fornell’s computation of the latent variables

Example : Customer Satisfaction Index

$\text{CSI} = \frac{0.0158\,\text{CUSA1} + 0.0231\,\text{CUSA2} + 0.0264\,\text{CUSA3}}{0.0158 + 0.0231 + 0.0264}$

Mean and standard deviation of the latent variables

                          N    Minimum   Maximum     Mean     Std. Deviation
IMAGE                    250    26.49     100.00    72.6878      13.7660
CUSTOMER EXPECTATION     250    25.85     100.00    72.3198      14.1259
PERCEIVED QUALITY        250    23.95     100.00    74.5765      14.2573
PERCEIVED VALUE          250      .00     100.00    61.5887      20.5987
CUSTOMER SATISFACTION    250    23.68     100.00    71.2876      15.3417
COMPLAINT                250      .00     100.00    67.4704      25.2684
LOYALTY                  250     1.29     100.00    69.1757      21.2668
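The computation of a Fornell-type index combines the rescaling of the answers from the 1-10 scale to the 0-100 scale with the weighted average of the rescaled items. A minimal sketch for the Customer Satisfaction Index, using the three CUSA weights from the weights table; the function names and the sample answers are hypothetical.

```python
# PLS outer weights of CUSA1, CUSA2, CUSA3.
CUSA_WEIGHTS = (0.0158, 0.0231, 0.0264)

def rescale(v):
    """Map an answer on the 1-10 scale to the 0-100 scale."""
    return (v - 1) * 100 / 9

def csi(answers, weights=CUSA_WEIGHTS):
    """Weighted average of the rescaled answers (CUSA1, CUSA2, CUSA3)."""
    x = [rescale(v) for v in answers]
    return sum(w * xi for w, xi in zip(weights, x)) / sum(weights)

# Hypothetical respondent answering 8, 7 and 9.
score = csi((8, 7, 9))
```

Because the weights are divided by their sum, the index stays between 0 and 100 whatever the scale of the PLS weights.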

60

Correlations between manifest variables and latent variables

Correlations below 0.5 in absolute value are not shown.

Columns: Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint, Loyalty

[Table: the values are listed below per block of rows, in the order extracted; the column of each individual value is not recoverable.]

Image1 to Image5: .717 .565 .657 .791 .698 .571 .571 .544 .539 .543 .500

C_exp1 to C_exp3: .689 .644 .724

P_qual1 to P_qual7: .622 .621 .599 .551 .596 .537 .778 .651 .801 .760 .732 .766 .803 .547 .661 .651 .587 .516 .539 .707

P_val1 to P_val2: .541 .594 .933 .911 .631 .524

C_sat1 to C_sat3: .558 .524 .613 .638 .672 .684 .588 .711 .872 .884 .547 .610

Complaint: .537 .540 1

Loyalty1 to Loyalty3: .528 .537 .659 .854 .869

61

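ECSI path model for a “mobile phone provider”

Regression on standardized variables and t-statistics provided by PLS-Graph bootstrap, construct level change option

[Figure: path diagram linking Image, Customer expectation, Perceived quality, Perceived value, Customer satisfaction, Complaint and Loyalty. Each arrow carries a path coefficient with its t-statistic in parentheses, and each endogenous latent variable carries its R2.]

Path coefficients (t-statistics) shown on the diagram: .492 (7.67), .544 (10.71), .066 (1.10), .037 (1.14), .153 (3.07), .211 (2.54), .541 (6.93), .543 (8.62), .201 (3.59), .468 (5.18), .540 (11.08), .049 (1.11)

R2 values shown on the diagram: .242, .296, .335, .672, .432, .292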

62

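II. Study of the reduced model using Fornell’s approach

[Figure: path diagram linking Customer expectation, Perceived quality, Perceived value, Customer satisfaction and Loyalty. Each arrow carries a path coefficient with its t-statistic in parentheses, and each endogenous latent variable carries its R2.]

Path coefficients (t-statistics) shown on the diagram, as extracted: .545 (8.92), .070 (1.08), .053 (1.20), .538 (6.59), .638 (3.70), .216 (12.35), .634 (11.50)

R2 values shown on the diagram: .402, .297, .335, .660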

63

The new PLS weights

         Weight    Relative Weight
CE1       .0237        .336
CE2       .0206        .292
CE3       .0262        .372

PQ1       .0098        .139
PQ2       .0085        .121
PQ3       .0118        .168
PQ4       .0094        .134
PQ5       .0084        .119
PQ6       .0095        .135
PQ7       .0129        .183

PV1      -.0239        .492
PV2      -.0247        .508

CS1       .0157        .241
CS2       .0240        .368
CS3       .0256        .392

CL1      -.0188        .405
CL2      -.0050        .108
CL3      -.0226        .487

For each latent variable the relative weights sum up to 1.
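The relative weights are the weights of a block divided by their sum, which is why a block whose weights are all negative still gets positive relative weights. A short sketch; the function name is hypothetical and the inputs are taken from the table.

```python
def relative_weights(weights):
    """Divide each weight of a block by the sum of the block's weights."""
    total = sum(weights)
    return [w / total for w in weights]

ce = relative_weights([0.0237, 0.0206, 0.0262])   # customer expectation
pv = relative_weights([-0.0239, -0.0247])         # perceived value

print([round(w, 3) for w in ce])
print([round(w, 3) for w in pv])
```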

64

III. Study of the reduced model using AMOS: Model 1 (Standardized Results)

[Figure: AMOS path diagram of the reduced model, with latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, manifest variables CE1-CE3, PQ1-PQ7, PV1-PV2, CS1-CS3 and CL1-CL3, measurement errors e1-e18 and disturbances d1-d4, annotated with the standardized estimates.]

Chi-Square = 271, DF = 128, Chi-Square/DF = 2.12

65

Reduced model 2 (Standardized results)

[Figure: AMOS path diagram of the reduced model, with latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, manifest variables CE1-CE3, PQ1-PQ7, PV1-PV2, CS1-CS3 and CL1-CL3, measurement errors e1-e18 and disturbances d1-d4, annotated with the standardized estimates.]

Chi-Square = 271, DF = 130, Chi-Square/DF = 2.08, RMSEA = 0.066

H0: RMSEA ≤ 0.05 : p-value = 0.01

66

Reduced model 2 (Unstandardized results)

[Figure: AMOS path diagram of the reduced model, with latent variables CUS_EXP, PER_QUAL, PER_VAL, CSI and CUS_LOY, manifest variables CE1-CE3, PQ1-PQ7, PV1-PV2, CS1-CS3 and CL1-CL3, measurement errors e1-e18 and disturbances d1-d4, annotated with the unstandardized loadings and variances.]

Chi-Square = 271.118, df = 130

Chi-Square/df = 2.086, rmsea = .066

p-value (rmsea =< .05) = .010

67

Specific estimation of the latent variables

• Each latent variable is estimated as a weighted average of its own manifest variables, using the loadings λhj .

• For example

$Y_4 = \frac{\lambda_{41}\,\text{CUSA1} + \lambda_{42}\,\text{CUSA2} + \lambda_{43}\,\text{CUSA3}}{\lambda_{41} + \lambda_{42} + \lambda_{43}}$

is the Customer Satisfaction Index score.

• Each coefficient λ4j is the regression coefficient of ξ4 in the regression relating the manifest variable X4j to its latent variable ξ4 (similar to PLS weight estimation when mode A is used).

68

Loadings and LISREL weights

         Loading    Weight
CE1       1.000      .343
CE2       0.913      .313
CE3       1.004      .344

PQ1       1.000      .133
PQ2       0.988      .132
PQ3       1.241      .165
PQ4       1.061      .141
PQ5       0.911      .121
PQ6       1.045      .139
PQ7       1.265      .168

PV1       1.000      .483
PV2       1.069      .517

CS1       1.000      .239
CS2       1.549      .370
CS3       1.634      .391

CL1       1.000      .424
CL2       0.202      .086
CL3       1.155      .490

69

Comparison between the PLS and LISREL weights

[Figure: scatter plot of the LISREL weights (vertical axis, 0.0 to .6) against the PLS relative weights (horizontal axis, .1 to .6).]

70

Correlations between the PLS latent variables and the specific LISREL latent variables

[Figure: five scatter plots, one per latent variable (CUS_EXP, PER_QUAL, PER_VAL, CSI, CUS_LOY), each showing the PLS latent variable on the vertical axis against the specific LISREL latent variable on the horizontal axis.]

All the correlations are above .998

71

First conclusions

• If COV-BASED SEM works, the PLS results can be derived from the COV-BASED SEM results.

• If COV-BASED SEM does not work, PLS is still an alternative.

• If COV-BASED SEM is not adequate (small number of observations and/or large number of variables) PLS can be used for exploratory purposes.

72

Usual estimation of latent variables in LISREL

proc calis covariance modification data=ecsi outstat=a;
   lineqs
      CUEX1 = 1 f1 + e1,
      CUEX2 = Lambda12 f1 + e2,
      CUEX3 = Lambda13 f1 + e3,
      .
      .
      .
      CUSL1 = 1 f5 + e16,
      CUSL2 = Lambda52 f5 + e17,
      CUSL3 = Lambda53 f5 + e18,
      f2 = beta21 f1 + d2,
      f3 = beta31 f1 + beta32 f2 + d3,
      f4 = beta41 f1 + beta42 f2 + beta43 f3 + d4,
      f5 = beta54 f4 + d5;
   std
      e1-e18 = vare1-vare18,
      d2-d5 = vard2-vard5,
      f1 = varf1;
   var CUEX1 CUEX2 CUEX3 PERQ1 PERQ2 PERQ3 PERQ4 PERQ5
       PERQ6 PERQ7 PERV1 PERV2 CUSA1 CUSA2 CUSA3 CUSL1
       CUSL2 CUSL3;
run;

proc print data=a (where = (_type_="SCORE"));
run;

73

Variable weights for the usual estimation of the latent variables in LISREL

        CUEX1      CUEX2       CUEX3       PERQ1      PERQ2      PERQ3
f1     0.11102    0.074334    0.055776    0.07362    0.024507   0.050785
f2     0.03242    0.021705    0.016287    0.12987    0.043233   0.089590
f3    -0.00083   -0.000558   -0.000418    0.01235    0.004112   0.008522
f4     0.01321    0.008842    0.006634    0.04578    0.015241   0.031583
f5     0.00906    0.006064    0.004550    0.03140    0.010453   0.021661

        PERQ4      PERQ5      PERQ6      PERQ7      PERV1      PERV2      CUSA1
f1     0.046274   0.049023   0.046702   0.051524   -0.00071   -0.00440   0.030850
f2     0.081633   0.086483   0.082388   0.090894    0.00466    0.02873   0.047098
f3     0.007765   0.008227   0.007837   0.008646    0.10833    0.66864   0.027074
f4     0.028778   0.030488   0.029044   0.032043    0.00992    0.06122   0.093221
f5     0.019737   0.020910   0.019920   0.021977    0.00680    0.04199   0.063936

        CUSA2      CUSA3      CUSL1      CUSL2      CUSL3
f1     0.027752   0.03606    0.00387    0.000423   0.01533
f2     0.042369   0.05506    0.00591    0.000645   0.02340
f3     0.024356   0.03165    0.00340    0.000371   0.01345
f4     0.083861   0.10898    0.01170    0.001277   0.04632
f5     0.057516   0.07474    0.10838    0.011830   0.42895

74

Correlations between the PLS latent variables and the usual estimated LISREL latent variables

[Figure: five scatter plots, one per latent variable, each showing the PLS latent variable on the vertical axis against the usual LISREL estimate on the horizontal axis, with the R2 of the fit.]

CUS_EXP : Rsq = 0.6227

PER_QUAL : Rsq = 0.9599

PER_VAL : Rsq = 0.9010

CSI (F4) : Rsq = 0.8981

CUS_LOY (F5) : Rsq = 0.8592

75

Final conclusions

• COV-BASED SEM did not work on the full model.

• COV-BASED SEM gives better results for the inner model (relating the latent variables to one another) because the latent variables are space-free.

• PLS gives better results for the outer model (relating the manifest variables to their latent variables) because each latent variable is constrained to be in its own manifest variables space.

• If each COV-BASED SEM latent variable is estimated as a weighted average of its own manifest variables, then COV-BASED SEM and PLS give almost identical latent variable estimates (at least on the examples we have studied).

76

PLS Path Modeling and Multiple Table Analysis

Application to the study of the cosmetic habits of women in Ile-de-France

Christiane Guinot (CERIES/CHANEL)

& Michel Tenenhaus (HEC)

77

Objective of the analysis

We have applied the PLS approach to a study of the cosmetic habits of women living in Ile-de-France.

The aim of the project was to obtain a global score describing the propensity to use cosmetic products in this sample.

Then, we used behavioural and skin characteristic variables, which are known to account for the variation in use of cosmetic products, to check on the relevance of this score.

78

Data

The cosmetic products were divided into four blocks corresponding to different cosmetic practices:

Body care : soap, liquid soap, moisturising body cream, hand creams

Face care : make-up and eye make-up removers, tonic lotions, day creams, night creams, exfoliation products

Make-up : blushers, mascaras, eye shadows, eye pencils, lipsticks, lip shiners and nail polish

Sun care : sun protection products for face and for body, after-sun products for face and for body

79

Construction of a global score

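[Diagram: the manifest variables (cosmetic practices) are arranged in four blocks: Body care, Face care, Make-up and Sun care. Each block has its own latent variable, numbered 1 to 4; these are labelled "Latent Variables (B)" and give the partial scores. A further latent variable, labelled "Latent Variable (A)", is attached to the manifest variables of all four blocks and gives the global score.]

Inner model: centroid scheme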
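The construction above can be sketched as follows. This is a minimal illustration under stated assumptions, not the software used for the study: the data are synthetic, all blocks are estimated with Mode A (outer weights proportional to covariances), the global block is formed by concatenating the four blocks, and the function names `standardize` and `hierarchical_pls` are hypothetical.

```python
import numpy as np


def standardize(a):
    """Center each column (or a single vector) and scale it to unit variance."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean(axis=0)) / a.std(axis=0)


def hierarchical_pls(blocks, max_iter=500, tol=1e-10):
    """Return (partial scores, global score) for a list of data blocks."""
    X = [standardize(b) for b in blocks]
    X.append(np.hstack(X))  # super-block of all variables, carrying the global score
    K = len(blocks)

    # Inner design: each partial latent variable is linked to the global one only.
    C = np.zeros((K + 1, K + 1))
    C[:K, K] = 1.0
    C[K, :K] = 1.0

    # Initial outer estimates: standardized sum of each block's variables.
    Y = np.column_stack([standardize(Xk.sum(axis=1)) for Xk in X])

    for _ in range(max_iter):
        # Centroid scheme: the inner estimate of a latent variable is the sum of
        # the connected scores, each signed by its correlation with that variable.
        R = np.corrcoef(Y, rowvar=False)
        Z = Y @ (C * np.sign(R))
        # Mode A (assumed): outer weights are covariances with the inner estimate.
        Y_new = np.column_stack(
            [standardize(Xk @ (Xk.T @ Z[:, k])) for k, Xk in enumerate(X)]
        )
        converged = np.max(np.abs(Y_new - Y)) < tol
        Y = Y_new
        if converged:
            break
    return Y[:, :K], Y[:, K]


# Synthetic example: four blocks of 4, 6, 6 and 4 variables sharing one factor.
rng = np.random.default_rng(0)
n = 300
propensity = rng.standard_normal(n)
sizes = [4, 6, 6, 4]
blocks = [0.7 * propensity[:, None] + 0.7 * rng.standard_normal((n, p)) for p in sizes]

partial, global_score = hierarchical_pls(blocks)
correlations = [np.corrcoef(global_score, partial[:, k])[0, 1] for k in range(4)]
print(np.round(correlations, 2))
```

The inner design matrix `C` encodes the only structural choice: partial scores communicate with each other solely through the global score.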

80

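Results

[Path diagram: the four block latent variables (1: Body care, 2: Face care, 3: Make-up, 4: Sun care) and the global score (propensity to use cosmetic products). Correlations are shown on the arrows between manifest and latent variables, regression coefficients on the arrows between the partial scores and the global score.]

Correlations of each manifest variable with its partial score and with the global score:

Block      Manifest variable       Partial score   Global score
Body care  soap body               -.24            -.12
Body care  liquid soap              .47             .23
Body care  body cream               .80             .40
Body care  hand cream               .56             .28
Face care  make-up remover          .44             .32
Face care  tonic lotion             .56             .40
Face care  eye make-up remover      .56             .40
Face care  day cream                .46             .33
Face care  night cream              .55             .39
Face care  exfoliation product      .57             .41
Make-up    blusher                  .57             .39
Make-up    mascara                  .72             .49
Make-up    eye shadow               .58             .39
Make-up    eye pencil               .43             .29
Make-up    lipstick                 .53             .36
Make-up    nail polish              .47             .31
Sun care   sun protection face      .64             .39
Sun care   after sun face           .73             .45
Sun care   sun protection body      .71             .44
Sun care   after sun body           .78             .49

Regression coefficients of the global score on the partial scores: Body care .27, Face care .42, Make-up .43, Sun care .44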

81

Result: global score

Global score = -3.40
- .11 * soaps and toilet soaps for body care
+ .20 * liquid soaps for body care
+ .38 * moisturising body creams and milks
+ .25 * hand creams and milks
+ .21 * make-up removers
+ .26 * tonic lotions
+ .30 * eye make-up removers
+ .39 * moisturising day creams
+ .30 * moisturising night creams
+ .30 * exfoliation products
+ .26 * blushers
+ .41 * mascara
+ .26 * eye shadows
+ .20 * eye pencils
+ .33 * lipsticks and lip shiners
+ .20 * nail polish
+ .36 * sun protection products for the face
+ .31 * moisturising after sun products for the face
+ .38 * sun protection products for the body
+ .34 * moisturising after sun products for the body
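As a small sketch of how this formula would be applied to one respondent: the coefficients below are copied from the slide, while the variable names and the numeric coding of each product (for example a frequency-of-use scale) are assumptions, since the slides do not give the coding.

```python
# Intercept and coefficients copied from the global score formula.
INTERCEPT = -3.40
COEFFICIENTS = {
    # Body care
    "soap_body": -0.11, "liquid_soap": 0.20, "body_cream": 0.38, "hand_cream": 0.25,
    # Face care
    "make_up_remover": 0.21, "tonic_lotion": 0.26, "eye_make_up_remover": 0.30,
    "day_cream": 0.39, "night_cream": 0.30, "exfoliation_product": 0.30,
    # Make-up
    "blusher": 0.26, "mascara": 0.41, "eye_shadow": 0.26, "eye_pencil": 0.20,
    "lipstick": 0.33, "nail_polish": 0.20,
    # Sun care
    "sun_protection_face": 0.36, "after_sun_face": 0.31,
    "sun_protection_body": 0.38, "after_sun_body": 0.34,
}


def global_score(usage):
    """Global score of one respondent; `usage` maps each variable name to its value."""
    missing = set(COEFFICIENTS) - set(usage)
    if missing:
        raise KeyError(f"missing variables: {sorted(missing)}")
    return INTERCEPT + sum(coef * usage[name] for name, coef in COEFFICIENTS.items())


# Hypothetical respondent coded 0 on every product (coding assumed, see above).
never_user = {name: 0 for name in COEFFICIENTS}
print(round(global_score(never_user), 2))  # -3.4
```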

82

Results: partial scores

[Figure: four panels, one per partial score: Score body-care, Score facial-care, Score make-up, Score sun-care]

83

Result: global score

Correlations between the partial scores and the global score:

                S_body-care   S_facial-care   S_make-up   S_sun-care
S_facial-care   0.24001
S_make-up       0.13462       0.35035
S_sun-care      0.16500       0.19075         0.14273
S_global        0.50263       0.71846         0.67347     0.62071

84

Relevance of the global score

Factors influencing the use of cosmetic products

To identify the behavioural and skin characteristic variables which best account for the variation in the use of cosmetic products, we can relate the global score to the following variables:

• Professional activity & socio-professional category

• Children

• Sun exposure habits

• Practice of sport

• Importance of physical appearance

• Type of facial skin & type of body skin

• Age

85

Relevance of the global score

E(Global score) = -1.02
+ .21 * professional activity
+ .07 * housewife or student
+ .00 * retired
+ .27 * CSP A (craftsmen, trades people, business managers, managerial staff, academics and professionals)
+ .09 * CSP B (farmers and intermediary professions)
+ .05 * CSP C (employees and working class people)
+ .00 * CSP D (retired and non working people)
- .21 * without child
+ .00 * with child
+ .40 * habits of deliberate exposure to sunlight
+ .09 * previous habits of deliberate exposure to sunlight
+ .00 * no habits of deliberate exposure to sunlight
- .17 * no sport practised
+ .00 * sport practised
+ 1.04 * physical appearance is of extreme importance
+ .89 * physical appearance is of high importance
+ .50 * physical appearance is of some importance
+ .00 * physical appearance is of little importance
- .06 * oily facial skin
+ .16 * combination facial skin
- .20 * normal facial skin
+ .00 * dry facial skin
- .32 * oily body skin
- .57 * combination body skin
- .32 * normal body skin
+ .00 * dry body skin
- .00 * age

86

E(Global score), same equation as on the previous slide, with the terms that apply to this profile highlighted:

-1.02
+ .21 * professional activity
+ .27 * CSP A (craftsmen, trades people, business managers, managerial staff, academics and professionals)
+ .00 * with child
+ .40 * habits of deliberate exposure to sunlight
+ .00 * sport practised
+ 1.04 * physical appearance is of extreme importance
+ .16 * combination facial skin
+ .00 * dry body skin

= 1.06

A good profile
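The value 1.06 can be checked with a short sketch. The coefficients are those of the E(Global score) equation; the dictionary layout, the shortened category labels and the function name `expected_score` are illustrative assumptions. The age term is left out because its coefficient is printed as -.00.

```python
# Coefficients of the E(Global score) equation, grouped by explanatory variable.
INTERCEPT = -1.02
EFFECTS = {
    "activity": {"professional activity": 0.21, "housewife or student": 0.07, "retired": 0.00},
    "csp": {"A": 0.27, "B": 0.09, "C": 0.05, "D": 0.00},
    "children": {"without child": -0.21, "with child": 0.00},
    "sun_exposure": {"habits": 0.40, "previous habits": 0.09, "no habits": 0.00},
    "sport": {"no sport practised": -0.17, "sport practised": 0.00},
    "appearance": {"extreme": 1.04, "high": 0.89, "some": 0.50, "little": 0.00},
    "facial_skin": {"oily": -0.06, "combination": 0.16, "normal": -0.20, "dry": 0.00},
    "body_skin": {"oily": -0.32, "combination": -0.57, "normal": -0.32, "dry": 0.00},
}


def expected_score(profile):
    """Expected global score for a profile giving one category per variable."""
    return INTERCEPT + sum(EFFECTS[var][profile[var]] for var in EFFECTS)


# Categories highlighted for the good profile.
good_profile = {
    "activity": "professional activity",
    "csp": "A",
    "children": "with child",
    "sun_exposure": "habits",
    "sport": "sport practised",
    "appearance": "extreme",
    "facial_skin": "combination",
    "body_skin": "dry",
}
print(round(expected_score(good_profile), 2))  # 1.06
```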

87

E(Global score), same equation, with the terms that apply to this profile highlighted:

-1.02
+ .00 * retired
+ .00 * CSP D (retired and non working people)
- .21 * without child
+ .00 * no habits of deliberate exposure to sunlight
- .17 * no sport practised
+ .00 * physical appearance is of little importance
- .20 * normal facial skin
- .57 * combination body skin

Value shown on the slide: -2.00

A bad profile

88

Conclusion

Using the PLS approach, we obtain a score representing the propensity to use cosmetic products. This score balances the different types of cosmetic products better than a score obtained by principal component analysis.

89

Final conclusion

« All the proofs of a pudding are in the eating, not in the cooking ».

William Camden (1623)

90

Some references on PLS Path Modeling

• CHIN W.W. (2001) : PLS-Graph User’s Guide, C.T. Bauer College of Business, University of Houston, Houston.

• CHIN W.W. (1998) : “The partial least squares approach for structural equation modeling”, in: G.A. Marcoulides (Ed.) Modern Methods for Business Research, Lawrence Erlbaum Associates, pp. 295-336.

• FORNELL C. (1992) : “A National Customer Satisfaction Barometer: The Swedish Experience”, Journal of Marketing, Vol. 56, 6-21.

• FORNELL C. & CHA J. (1994) : “Partial Least Squares”, in Advanced Methods of Marketing Research, R.P. Bagozzi (Ed.), Basil Blackwell, Cambridge, MA., pp. 52-78.

• GUINOT C., LATREILLE J. & TENENHAUS M. (2001) : “PLS Path Modeling and Analysis of Multiple Tables”, Chemometrics and Intelligent Laboratory Systems, Special issue on PLS methods, 58.

• LOHMÖLLER J.-B. (1987) : LVPLS Program Manual, Version 1.8, Zentralarchiv für Empirische Sozialforschung, Köln.

91

• LOHMÖLLER J.-B. (1989) : Latent Variable Path Modeling with Partial Least Squares, Physica-Verlag, Heidelberg.

• PAGÈS J. & TENENHAUS M. (2001) : “Multiple Factor Analysis and PLS Path Modeling”, Chemometrics and Intelligent Laboratory Systems, 58, 261-273.

• TENENHAUS M. (1998) : La Régression PLS, Éditions Technip, Paris.

• TENENHAUS M. (1999) : “L’approche PLS”, Revue de Statistique Appliquée, vol. 47, n°2, pp. 5-40.

• TENENHAUS M., ESPOSITO VINZI V., CHATELIN Y.-M. & LAURO C. (2005) : “PLS Path Modeling”, Computational Statistics and Data Analysis.

• WOLD H. (1985) : “Partial Least Squares”, in Encyclopedia of Statistical Sciences, vol. 6, Kotz S. & Johnson N.L. (Eds), John Wiley & Sons, New York, pp. 581-591.