
Wold-Kettaneh, PCA & PLS, ASQ, May 98

ANALYZING COMPLICATED DATA SETS

by

PCA (principal components analysis), and

PLS (projections to latent structures)

Multivariate SPC (MSPC), and other process applications

Svante Wold & Nouna Kettaneh

Umeå University, Sweden & Umetrics Inc., NJ, USA


INTRODUCTION

• Processes -- lots of demands
• Quality, high yield, little pollution, ...
• Low cost, high throughput, ...
• Lots of data -- very multivariate (often 1000’s of variables)
– collinear: rank of X << K (rank often 2 to 5)
– noisy
– often inadequate (some essential factors not measured)
– often incomplete (missing data)
– we drown in the data -- or throw most of them away


PURPOSES of ANALYZING PROCESS DATA

• Information about state of process
– OK or not (MSPC, Classification)
– Variables related to faults, upsets, etc.
• Modelling
– which are the few dominating relationships?
• Improvement of the process
– which conditions give better results (yield, …)?
• Easily understandable presentation -- GRAPHICS


EXAMPLE (secret origin, we apologize)

• N = 92 hourly observations from a “campaign”
• K = 7 + 18 input & intermediate variables (X)
• M = 8 Y-variables (responses)
– y6 (impurity) and y8 (yield) are the two most important
• Serious process problems around time 80
– process shut down at time 92
• Could this have been prevented by monitoring the process multivariately?


Tables are not useful for overview or understanding

obs x1in x2in x3in x4in x5in x6in x7in y1 y2 y3 y4 y5 y6* y7 y8* x8md x9md xamd

1 0.47 -1.66 -0.19 1.94 0.07 -4.54 -0.09 -1.13 0.62 0.24 0.61 -0.45 0.32 -0.23 -0.89 0.7 -0.31 0.78

2 0.05 -0.83 0.04 0.75 0.25 -0.02 -0.6 -0.89 0.68 0.14 0.7 0.17 0.37 -0.15 -0.76 0.75 -0.32 0.61

3 -0.58 -0.21 -0.08 0.89 0.34 0.12 -0.86 -0.81 0.77 0.12 0.71 0.13 0.38 -0.11 -0.84 0.77 0.44 0.58

4 -0.9 0.11 0.16 1.43 0.45 0.13 -0.41 -0.7 0.69 0.21 0.61 0.12 0.3 -0.08 -1.09 0.11 0.9 0.78

5 -0.78 -0.33 -0.34 1.51 0.5 -0.01 -0.31 -0.38 0.91 0.07 0.83 0.07 0.52 -0.04 -1.17 0.79 1.13 0.79

6 -0.87 -0.86 -0.6 0.5 0.49 0.03 -0.36 -0.42 0.85 0.13 0.78 0.76 0.45 0 -1.27 0.57 0.88 0.39

7 -0.49 0.48 -0.28 0.55 0.3 -0.1 -0.41 -0.58 0.68 0.21 0.64 0.11 0.35 -0.02 -1.02 0.53 1.2 0.3

8 -0.39 1.1 0.2 -0.31 0.28 -0.1 -0.09 -0.97 0.35 0.28 0.34 0.2 0.12 -0.05 -0.68 0.33 1.56 0.11

9 -0.06 0.95 0.24 -1.08 0.25 0.12 0.11 -0.87 0.5 0.09 0.57 -0.43 0.27 -0.09 -0.82 -0.12 1.1 0.44

10 0.27 0.11 -0.86 -0.87 0.15 0.19 -0.06 -0.62 0.39 0.01 0.45 0.07 0.29 -0.23 -1.12 -0.03 1.09 0.5

11 0.2 0.52 -0.77 -1.08 0.08 0.3 0.08 -0.79 0.24 0.06 0.3 -0.45 0.16 -0.04 -0.69 -0.11 1.18 0.6

12 0.1 0.62 -4.38 -0.88 -0.07 0.21 0.14 -0.9 0.18 0.18 0.2 -0.45 0.1 -2.95 -0.68 -0.11 0.87 0.63

13 -0.38 0.81 -0.27 -1.22 -0.04 0.2 0.05 -0.71 0.11 0.06 0.11 -0.5 0.02 0.04 -0.46 -1.08 -0.03 0.44

14 -0.01 0.72 0.2 -1.04 -0.01 0.3 0.14 -0.7 0.13 0.1 0.18 -0.49 0.12 0.05 0.08 -1 -0.16 0.07

15 -0.62 0.19 -0.06 -0.94 -0.13 0.23 0.01 -0.57 0.14 0.04 0.18 -0.48 0.12 0.07 0.5 -1.01 -0.22 0.07

16 -0.78 0.25 -0.16 -0.72 0.01 0.3 0.1 -0.58 -0.03 -0.08 0.06 -0.49 0.11 0.09 0.9 -1 -0.18 0.09

17 -0.65 -1.86 -1.48 -0.72 0 0.21 0.35 -0.61 -0.18 -0.15 -0.06 -0.39 0.04 0.06 1.5 -1.04 -0.69 0.06

18 -1.14 0.04 0.34 -0.26 0.15 0.21 0.3 -0.53 -0.32 -0.02 -0.23 -1.17 -0.03 0.1 1.11 -1.02 -0.59 0.15

19 -0.59 0.17 0.66 0.57 0.23 0.24 -0.12 -0.51 -0.57 0.06 -0.4 -1.14 -0.14 0.1 1.65 -1.36 -0.94 -0.02

(but they are useful for data storage and retrieval)


Two complementary ways to analyze and model data

• Detailed (fundamental) models, K << N
– often based on differential equations
– useful for rather simple systems
– e.g., engineering process control: one response (y), 1-2 predictors (x)
• “Soft” statistical models, 0 < K/N < ∞
– often based on Taylor (or other) expansions of an unknown function
– useful also for complicated systems
– e.g., process monitoring with many variables (e.g., K=6413)


“Soft” statistical models, 0 < K/N < ∞

• Monitoring, control charts
– univariate SPC: Shewhart, EWMA, CuSum, …
– multivariate MSPC: Shewhart, EWMA, CuSum, … but in “scores” (aggregates), plus residual-based diagnostics
• Complicated relationships
– process conditions ⇔ results (yield, purity, strength, …)
– composition ⇔ properties (color, strength, …)
– spectra (NIR, …) ⇔ properties (concentrations, energy content)


Multivariate analysis by means of projections

• Data shaped as a table, X
• Space with K axes (K-space); K = number of variables (columns).
  Each observation (process time point) is a point in this space
• Multivariate analysis
– finding structures in M-space
– describing them (math & stat)
– using them for problem solving


First order perturbation theory

X data are approximated by a point, line, plane, or hyper-plane
= multivariate model with 0, 1, 2, 3 or more “components”

coordinates in plane = scores (t)
directions of plane = loadings (p)
distance, data point to plane = residual SD (DModX)

• graphics (t t, p p, DModX)

Classification, Identification, Quantification


PCA: “best” approximation (summary) of X

Least squares line or plane; SVD of X; EV of X’X

1. Line through mean point
2. Line orthogonal to first


Data table X approximated as: X = T P’ + E

Columns of T give the score plot. Rows of P’ give the loading plot.

Directions in score plot (left) and loading plot (right) correspond
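To make the decomposition X = T P’ + E concrete, here is a minimal sketch in plain Python. It assumes the NIPALS iteration as the way to extract components (the slides only say "SVD of X; EV of X’X", so the algorithm choice is an assumption), and the function names `center_and_scale` and `nipals_pca` are hypothetical.

```python
import math

def center_and_scale(X):
    """Column-center X and scale each column to unit variance (sample SD)."""
    n, k = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(k)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in X) / (n - 1))
           for j in range(k)]
    return [[(row[j] - means[j]) / sds[j] for j in range(k)] for row in X]

def nipals_pca(X, n_comp, tol=1e-12, max_iter=1000):
    """Return scores T (N x A), loadings P (K x A), residuals E with X = T P' + E.

    Assumes n_comp is smaller than the rank of X, so no component is empty.
    """
    E = [row[:] for row in X]                  # working copy, deflated per component
    n, k = len(E), len(E[0])
    T_cols, P_cols = [], []
    for _ in range(n_comp):
        # start from the column with the largest remaining sum of squares
        j0 = max(range(k), key=lambda j: sum(row[j] ** 2 for row in E))
        t = [row[j0] for row in E]
        for _ in range(max_iter):
            tt = sum(v * v for v in t)
            p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(k)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]          # unit-length loading vector
            t_new = [sum(E[i][j] * p[j] for j in range(k)) for i in range(n)]
            diff = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if diff <= tol * sum(v * v for v in t):
                break
        # remove the fitted component: E <- E - t p'
        E = [[E[i][j] - t[i] * p[j] for j in range(k)] for i in range(n)]
        T_cols.append(t)
        P_cols.append(p)
    T = [list(r) for r in zip(*T_cols)]        # rows = observations
    P = [list(r) for r in zip(*P_cols)]        # rows = variables
    return T, P, E
```

Plotting two columns of `T` against each other gives a score plot; plotting the matching columns of `P` gives the loading plot.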


And what does the score plot show for the example?

[Figure: PCA score plot, axes tPS[1] and tPS[2], observations 1-92 labelled by number. Ellipse: Hotelling T2PS (0.05). PROC1A PCA X&Y, obs 1-69, 3 signif. comps., obs 70-92 predicted. Simca-P 3.01 by Umetri AB, 1998-02-10 11:01]

PCA of “training” data X&Y (69 x 33); rest predicted.
X centered and scaled to unit variance before the PCA.
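The "train on 1-69, predict the rest" idea can be sketched as follows. This assumes orthonormal loadings from a PCA of the centered and scaled training data, and the usual Hotelling T2 limit for new observations; the helper names and the `f_crit` argument (an F quantile to be looked up in a table for A and N - A degrees of freedom) are hypothetical and not taken from the slides.

```python
def score_variances(T):
    """Variance of each score column over the training set (scores have mean 0)."""
    n, a = len(T), len(T[0])
    return [sum(row[c] ** 2 for row in T) / (n - 1) for c in range(a)]

def project(x, means, sds, P):
    """Scale a new observation with the training means/SDs and project it: t = z P."""
    z = [(v - m) / s for v, m, s in zip(x, means, sds)]
    return [sum(z[j] * P[j][c] for j in range(len(z))) for c in range(len(P[0]))]

def hotelling_t2(t, variances):
    """T2 of one observation: sum over components of t_a^2 / var(t_a)."""
    return sum(v * v / s2 for v, s2 in zip(t, variances))

def t2_limit(n_train, n_comp, f_crit):
    """Limit for new observations: A (N^2 - 1) / (N (N - A)) * F(alpha; A, N - A)."""
    return n_comp * (n_train ** 2 - 1) / (n_train * (n_train - n_comp)) * f_crit

# Toy training scores (4 observations, 2 components), only for illustration.
T_train = [[1.0, 0.5], [-1.0, 0.5], [2.0, -1.0], [-2.0, 0.0]]
var_t = score_variances(T_train)
t2_train = [hotelling_t2(t, var_t) for t in T_train]
```

An observation whose T2 exceeds the limit falls outside the ellipse in the score plot; for two components the ellipse is simply the contour T2 = limit.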


Example: Cusum chart emphasizing the persistent pattern in the score plot (previous page).
Cusum chart of 1st X-score (t1), continued beyond point 69

[Figure: CuSum chart of M4.tPS[1] against subgroup index (about 0 to 90), showing High Cusum, Low Cusum and Dev. from Target, with Dead band (K) and Action limit (H) lines; training set is obs 1-69. S (M4) = 2.548, Target (M4) = 0, Action limit (H) = 11.4…, Dead band (K) = 1.2…4. PROC1A PCA X&Y, obs 1-69, 3 signif. comp; CuSum (subgroup 1): M4.tPS[1]. Simca-P 3.01 by Umetri AB, 1998-02-10 10:58]
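A minimal sketch of a two-sided (tabular) CuSum on a score series, with a dead band K and an action limit H as named in the chart. Whether Simca-P computes its chart exactly this way is an assumption; the function name `tabular_cusum` and the toy numbers are hypothetical.

```python
def tabular_cusum(values, target, k, h):
    """Two-sided tabular CuSum.

    values : score series, e.g. t1 per observation
    target : target value of the series
    k      : dead band (deviations smaller than k are not accumulated)
    h      : action limit (alarm when a cumulative sum passes +/- h)
    Returns (high_cusum, low_cusum, alarm_indices).
    """
    hi = lo = 0.0
    high, low, alarms = [], [], []
    for i, x in enumerate(values):
        hi = max(0.0, hi + (x - target) - k)   # accumulates upward drift
        lo = min(0.0, lo + (x - target) + k)   # accumulates downward drift
        high.append(hi)
        low.append(lo)
        if hi > h or lo < -h:
            alarms.append(i)
    return high, low, alarms

# Toy series: on target for three points, then a persistent downward shift.
high, low, alarms = tabular_cusum([0, 0, 0, -3, -3, -3, -3], target=0.0, k=1.0, h=5.0)
```

Because small deviations of the same sign accumulate, a persistent drift in t1 crosses the action limit even when no single score looks alarming on its own.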


How does this work?

Process “OK” corresponds to: data are close to a plane (or a hyper-plane)

Deviations:
• Away from plane: DModX (Resid. SD)
• In plane: scores outside Hotelling ellipse
• Displayed as ordinary control charts


Example: What do we see in DModX?

DModX = SD of residuals row-wise (observations)

[Figure: DModX(PS) vs. observation number (Num), obs 1-92, vertical axis from 0 to about 6, with critical limit line DCrit (0.05); Dcrit [3] = 1.315, normalized distances, non-weighted residuals. PROC1A.M4 (PC), PCA X&Y, obs 1-69, 3 signif. comp. Simca-P 3.01 by Umetri AB, 1998-02-10 11:22]
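The row-wise residual SD can be sketched as below. The degrees of freedom (K - A per row, (N - A - 1)(K - A) for the pooled training SD used for normalization) are an assumption about how the "normalized distances" are formed, and the function name `dmodx` is hypothetical.

```python
import math

def dmodx(E, n_comp, E_train=None):
    """Normalized distance to the model for each row of the residual matrix E.

    E       : residuals (rows = observations, columns = variables)
    n_comp  : number of components A in the model
    E_train : training-set residuals used for the pooled SD (defaults to E)
    """
    k = len(E[0])
    ref = E if E_train is None else E_train
    n = len(ref)
    # pooled residual SD of the training set (assumed df: (N - A - 1)(K - A))
    s0 = math.sqrt(sum(e * e for row in ref for e in row)
                   / ((n - n_comp - 1) * (k - n_comp)))
    # row-wise residual SD (assumed df: K - A), divided by the pooled SD
    return [math.sqrt(sum(e * e for e in row) / (k - n_comp)) / s0 for row in E]

# Toy residuals: 5 observations, 3 variables, model with A = 1 component.
E_toy = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 0.0],
         [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
d = dmodx(E_toy, n_comp=1)
```

For predicted observations, pass their residuals as `E` and the training residuals as `E_train`, so that new points are judged against the training-set scatter.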


Why is observation 71 so far from the model (hyper plane)?

Contribution plot of residuals row-wise: observation residuals * weights [w = sqrt(R2(k))]

[Figure: bar chart, Contribution DModX, Obs71, X residuals scaled, weight = RX; one bar per variable (x1in ... x7in, y1 ... y8*, x8md ... xfmd, xgnx ... xlnx, xmen ... xpen), vertical axis from about -8 to 0. PROC1A.M4 (PC), PCA X&Y, obs 1-69, 3 signif. comp. Simca-P 3.01 by Umetri AB, 1998-02-10 11:34]


The “contribution plot”: shows what has happened in the individual observation (both PCA & PLS)

• A score value (e.g. point 65) is suspect:
  we look at the data (x65,k - xavg,k) times a weight (pk, or sqrt(R2(k)))
• A residual SD (DModX), e.g., point 71, is suspect:
  we look at the residuals e71,k times a weight (R2(k))
• These “contribution” plots identify “culprit” variables
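The two kinds of contribution can be sketched in a few lines. The score contribution uses the loading p as weight; for the residual contribution this sketch uses sqrt(R2(k)), as stated on the contribution plot for observation 71. All function and variable names are hypothetical, and the numbers are toy values.

```python
import math

def score_contributions(x, x_avg, p):
    """Per-variable contribution to a score: (x_k - xavg_k) * p_k."""
    return [(xk - mk) * pk for xk, mk, pk in zip(x, x_avg, p)]

def residual_contributions(e, r2):
    """Per-variable contribution to DModX: e_k * sqrt(R2_k)."""
    return [ek * math.sqrt(rk) for ek, rk in zip(e, r2)]

def culprits(names, contrib, top=3):
    """Names of the variables with the largest absolute contributions."""
    order = sorted(range(len(contrib)), key=lambda j: abs(contrib[j]), reverse=True)
    return [names[j] for j in order[:top]]

# Toy observation with three variables.
names = ["x1", "x2", "x3"]
c_score = score_contributions([1.0, 2.0, -1.0], [0.0, 0.0, 0.0], [0.6, 0.0, 0.8])
c_resid = residual_contributions([2.0, -1.0], [0.25, 1.0])
```

With centered data and a unit-length loading vector, the score contributions sum to the score itself, which is why the largest bars point to the variables that moved the observation in the score plot.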


Why is obs. 65 so far to the right in the “normal” area?

Contribution plot of data (here X&Y) row-wise: (observation - mean vector) * weights [w = p1]

[Figure: bar chart, Contribution Scores, Obs65-AVG, weight = p, Comp1; one bar per variable (x1in ... x7in, y1 ... y8*, x8md ... xfmd, xgnx ... xlnx, xmen ... xpen), vertical axis from about -1.0 to 0.2. PROC1A.M4 (PC), PCA X&Y, obs 1-69, 3 signif. comp. Simca-P 3.01 by Umetri AB, 1998-02-10 11:49]


Why is obs. 80 outside “normal” area (Hotelling’s ellipse)?

Contribution plot of data (here X&Y) row-wise: (observation - mean vector) * weights [w = p1]

[Figure: bar chart, Contribution Scores, Obs80-AVG, weight = p, Comp1; one bar per variable (x1in ... x7in, y1 ... y8*, x8md ... xfmd, xgnx ... xlnx, xmen ... xpen), vertical axis from about -0.5 to 1.0. PROC1A.M4 (PC), PCA X&Y, obs 1-69, 3 signif. comp. Simca-P 3.01 by Umetri AB, 1998-02-10 11:57]


Best plots for understanding the process data:
either a two-dimensional score plot + DModX,
or separate plots for score 1 (t1), 2 (t2), …, DModX.


PLS addresses the relationship predictors X ⇒ responses Y, by making a model of X, and a connected model of Y

X = T P’ + E
Y = T C’ + F
T = X W*

eigenvectors:
t of XX’YY’
w of X’YY’X

(U & P are help matrices)
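The eigenvector statement can be checked numerically with a small sketch of the NIPALS iteration for the first PLS component. The iteration scheme is the standard one but is not spelled out on the slide, so treat it as an assumption; the helper names and the toy data (already column-centered) are hypothetical.

```python
import math

def matvec(A, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def pls_first_component(X, Y, n_iter=500):
    """First PLS component of centered X (N x K) and Y (N x M) by NIPALS."""
    Xt, Yt = transpose(X), transpose(Y)
    u = [row[0] for row in Y]                  # start from the first Y column
    for _ in range(n_iter):
        w = matvec(Xt, u)                      # w proportional to X'u
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]              # unit-length weight vector
        t = matvec(X, w)                       # X-score t = X w
        tt = sum(v * v for v in t)
        c = [v / tt for v in matvec(Yt, t)]    # Y-weights c = Y't / t't
        cc = sum(v * v for v in c)
        u = [v / cc for v in matvec(Y, c)]     # Y-score u = Y c / c'c
    p = [v / tt for v in matvec(Xt, t)]        # X-loading p = X't / t't
    return w, t, c, u, p

# Toy data: 5 observations, K = 3 predictors, M = 2 responses (columns sum to 0).
X = [[-2.0, -1.0, 1.0], [-1.0, 0.0, -1.0], [0.0, 1.0, 0.0],
     [1.0, -1.0, 0.0], [2.0, 1.0, 0.0]]
Y = [[-2.1, 1.0], [-0.9, -1.2], [0.2, 0.4], [0.8, -0.5], [2.0, 0.3]]
w, t, c, u, p = pls_first_component(X, Y)

# Check: X'YY'X w should be parallel to w (w is the dominant eigenvector).
v = matvec(transpose(X), matvec(Y, matvec(transpose(Y), matvec(X, w))))
cosine = sum(a * b for a, b in zip(v, w)) / math.sqrt(sum(a * a for a in v))
```

Each pass through the loop multiplies w by X’YY’X and renormalizes, so the iteration is a power method; multiplying the eigen-relation for w by X from the left gives the corresponding statement for t = Xw and XX’YY’.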


PLS and multiple y’s -- projection of X and Y.
Double objectives: model X, and let t predict Y

X-score (t) is distance from center (mean point) to projection


PLS: position in M-space ⇔ properties (Y-space)

• Scores (t) are a good summary of X
• t should hence predict Y (properties; yield, kappa, …)
• PLS projects X so that:
– X predicts Y
– X is well approximated
– diagnostics as PCA + more: t u plots, coefficients, VIP, …


Parameters and Diagnostics of PLS: same as PCA, but more; two spaces + t u + WC plots + VIP + ...

• Process monitoring:
– monitoring of X; t t & DModX & interpretation
– monitoring of Y; u u & DModY & -”-
– t u & residual (t,u) less common, because Y typically is out of sync. with X
• Modelling of complicated relationships, online chemical analysis, soft sensors, … (MVCalib)
– coefficients (b), PLS weights (w), predictions, …


Example again -- now a PLS model

PLS, X-scores: t3 vs. t2 (RX2 = 0.24 and 0.19)

[Figure: PLS score plot, axes tPS[3] and tPS[2], observations 1-92 labelled by number. Ellipse: Hotelling T2PS (0.05). PROC1A pls, obs 1-69, M3.tPS[3] / M3.tPS[2]. Simca-P 3.01 by Umetri AB, 1998-02-10 13:38]


PLS, DModY (Y-residuals, row-wise SD), A=5

[Figure: DModY(PS) vs. observation number (Num), obs 1-92, vertical axis from 0 to about 12. PROC1A.M3 (PLS), obs 1-69, DModY(PS) Comp 5 (Cum). Simca-P 3.01 by Umetri AB, 1998-02-10 13:49]


Contribution plot for DModY, obs. 71

[Figure: bar chart, Contribution DModY, Obs71, Y residuals scaled, weight = RY, Comp5 (Cum); one bar per response (y1 ... y8*), vertical axis from about -14 to 0. PROC1A.M3 (PLS), pls, obs 1-69, OK, PS-PROC1A. Simca-P 3.01 by Umetri AB, 1998-02-23 22:39]


Loadings (w* and c) for components 1&2 (5)

[Figure: PLS loading plot, axes w*c[1] and w*c[2], one labelled point per X-variable (x1in ... x7in, x8md ... xfmd, xgnx ... xlnx, xmen ... xpen) and per Y-variable (y1 ... y8*). PROC1A.M3 (PLS), obs 1-69, w*c[1]/w*c[2]. Simca-P 3.01 by Umetri AB, 1998-02-10 12:41]

R2 (modelled), per component

Comp   X    Y
1     10   33
2     19   14
3     27    3
4      7    8
5      6    3


Loadings (w* and c) for components 3&4 (5)

[Figure: PLS loading plot, axes w*c[3] and w*c[4], one labelled point per X-variable (x1in ... x7in, x8md ... xfmd, xgnx ... xlnx, xmen ... xpen) and per Y-variable (y1 ... y8*). PROC1A.M3 (PLS), pls, obs 1-69: w*c[3]/w*c[4]. Simca-P 3.01 by Umetri AB, 1998-02-10 12:50]

R2 (modelled), per component

Comp   X    Y
1     10   33
2     19   14
3     27    3
4      7    8
5      6    3


PLS regression coefficients, y6 (impurity), y8 (yield)

[Figure: two bar charts of PLS coefficients (CoeffCS), Comp 5 (Cum), one bar per X-variable (x1in ... x7in, x8md ... xfmd, xgnx ... xlnx, xmen ... xpen): y6* (vertical axis from about -0.2 to 0.3) and y8* (vertical axis from about -0.2 to 0.2). PROC1A.M3 (PLS), pls, obs 1-69, OK, Workset. Simca-P 3.01 by Umetri AB, 1998-02-10 13:02 and 13:06]

Y6: R2 = 0.81, Q2 = 0.65; Y8: R2 = 0.71, Q2 = 0.65

1. Lack of control
2. Model may be somewhat inadequate
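For readers unfamiliar with the two statistics quoted here, the usual definitions are sketched below. It is an assumption that the slide uses exactly these forms: R2 is the fraction of the variation of a response fitted by the model, and Q2 is the same quantity computed from cross-validated predictions, where \hat{y}_{(i)} denotes the prediction for observation i from a model fitted without that observation (or without its cross-validation group).

```latex
R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2},
\qquad
Q^2 = 1 - \frac{\mathrm{PRESS}}{\sum_i (y_i - \bar{y})^2},
\qquad
\mathrm{PRESS} = \sum_i \bigl(y_i - \hat{y}_{(i)}\bigr)^2
```

Q2 cannot exceed what the model can genuinely predict, so a gap between R2 and Q2 (0.81 vs. 0.65 for y6) indicates how much of the fit does not carry over to new observations.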


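VIP (Variable Importance for Projection), for xk:

$\mathrm{VIP}_k = \left\{ K \sum_a w_{ak}^2 \, SS(Y)_a \,/\, SS(Y)_{\mathrm{total}} \right\}^{1/2}$

(w_ak = PLS weight of variable k in component a; SS(Y)_a = sum of squares of Y explained by component a; K = number of X-variables)

[Figure: bar chart of VIP, Comp 5 (Cum), X-variables in the plotted order: x9md, xamd, xmen, xfmd, xcmd, xemd, xoen, xpen, xjnx, x1in, xhnx, x4in, x3in, xknx, x7in, xdmd, x6in, x8md, xbmd, x2in, x5in, xinx, xnen, xlnx, xgnx; vertical axis from 0.0 to about 1.6. PROC1A.M3 (PLS), pls, obs 1-69: VIP, Comp 5 (Cum). Simca-P 3.01 by Umetri AB, 1998-02-10 13:23]

Heuristic cutoff: VIP ≈ 0.8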
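A minimal sketch of the VIP computation, assuming the common reading of the formula in which K multiplies the weighted sum and the weight vectors have unit length (so that the squared VIP values add up to K). The function name `vip`, the argument layout and the toy numbers are hypothetical.

```python
import math

def vip(W, ss_y):
    """VIP per X-variable.

    W    : K x A matrix of PLS weights, each column of unit length
    ss_y : list of length A, sum of squares of Y explained by each component
    """
    k, a = len(W), len(W[0])
    total = sum(ss_y)                          # total explained SS of Y
    return [math.sqrt(k * sum(W[j][c] ** 2 * ss_y[c] for c in range(a)) / total)
            for j in range(k)]

# Toy model: K = 3 variables, A = 2 components.
W = [[0.6, 0.0],
     [0.8, 0.0],
     [0.0, 1.0]]
values = vip(W, ss_y=[3.0, 1.0])
important = [j for j, v in enumerate(values) if v > 0.8]   # heuristic cutoff
```

Because the squared values average to 1 under these assumptions, a VIP well below 1 (for example under the 0.8 heuristic) marks a variable that contributes less than an average variable to explaining Y.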


Other Process Related Multivariate Applications

• X = chemical composition and Y = product properties, e.g., tensile strength and internal viscosity of a polymer, or the taste of a beer.

• Multivariate characterization of raw materials, catalysts, and additives. PCA provides simple maps that show similarities and dissimilarities between different variants of the materials, useful for, e.g., designed experiments.

• X = multivariate sensors (e.g., digitized spectra or …), or X = "Soft sensors" (unspecific process data) ⇒ Y;
  Y = product properties, e.g., mol. weight, completeness of a polymerization, or the degree of delignification of pulp


Graphics and pictures ⇒ Emotions, Understanding, Control

• The process (or the material, catalyst, reaction, …)
  inside a certain interval in (a) scores, (b) residuals, DModX:
    corresponds to “OK”
  outside these intervals:
    alarm
    interpretation (which variables)
    hints for cure: above + experience
• Gives growth of knowledge and insight (= experience)


ANALYZING COMPLICATED DATA SETS by PCA and PLS

• Lots of data are measured on processes, products, materials, …
• Two complementary ways to use data
– Detailed models (engineering process control, …)
– “Soft” statistical models (monitoring, modelling, …)
• Multivariate analysis by means of projections
– Data table (matrix) ↔ Point swarm in space with K axes
– (Hyper) plane = good approximation of data (Taylor expansion)
– “Normal” process is close to this plane and in a limited domain (in T)
– This domain + distance to plane ⇒ graphs, plots
• Many applications, several already running
• Lots of data are GOOD if they are used appropriately


Running applications (most still are off-line)

• Astra, Pharmacia-Upjohn, Novo, Novartis, Merck, …
• LKAB, Noranda, SSAB, Avesta-Sheffield, ...
• MoDo, ASSI, Stora, SCA, Weyerhaeuser, Noranda …
• Harris, IBM, Ericsson, ABB, …
• Exxon, Shell, Norsk Hydro, Statoil, …
• Hoechst-Celanese, Perstorp, Novachem, Akzo-Nobel, ...
• Umeå Energi, Umeå Mejeri (Dairy), …

• Pharmaceuticals, Mining, Paper-Pulp, Semiconductors, Oil, Polymers, Chemicals, Incinerators, Wine, Beer, Whisky, Cheese, Cosmetics, …

