Page 1: PRINCIPAL COMPONENT ANALYSIS (PCA) EOFs and Principal Components; Selection Rules

PRINCIPAL COMPONENT ANALYSIS (PCA)

EOFs and Principal Components; Selection Rules

LECTURE 8

Supplementary Readings:

Wilks, Chapter 9

Page 2:

WE’LL START OUT WITH AN EXAMPLE: 20th CENTURY GLOBAL SURFACE TEMPERATURE RECORD

Page 3:

Climatic Research Unit (‘CRU’), University of East Anglia

Surface Temperature Changes

Page 4:

EOFs for the five leading eigenvectors of the global temperature data from 1902-1980.

The gridpoint areal weighting factor used in the PCA procedure has been removed from the EOFs so that relative temperature anomalies can be inferred from the patterns.

EOF #1: 12% (88%)
EOF #2: 6% (3%)
EOF #3: 5% (1%)
EOF #4: 4% (1%)
EOF #5: 3% (0.5%)
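The areal weighting mentioned in the caption can be illustrated with a minimal sketch. It assumes a square-root-of-cosine-latitude weight, which is a common choice but is not stated on the slide; the latitudes and the data are synthetic placeholders.

```python
import numpy as np

def area_weights(lat_deg):
    """sqrt(cos(latitude)) weights, one per gridpoint (an assumed, common choice)."""
    return np.sqrt(np.cos(np.deg2rad(lat_deg)))

rng = np.random.default_rng(0)
lat = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])  # hypothetical gridpoint latitudes
X = rng.standard_normal((lat.size, 40))           # M gridpoints x N times (synthetic)
X -= X.mean(axis=1, keepdims=True)                # zero mean at each gridpoint

w = area_weights(lat)
Xw = w[:, None] * X                               # weight each row before the PCA

# Left singular vectors of the weighted data are the weighted EOFs.
eofs_w, s, pcs_t = np.linalg.svd(Xw, full_matrices=False)

# Divide the weight back out so the patterns can be read as relative anomalies.
eofs = eofs_w / w[:, None]
```

Dividing the weights out changes only how the patterns are displayed; the unweighted patterns are in general no longer mutually orthogonal.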

Page 5:

FILTERING THROUGH PCA

SURFACE TEMPERATURE RECORD FILTERED BY RETAINING THE PROJECTION ONTO THE FIRST FIVE EIGENVECTORS

Page 6:

GLOBAL TEMPERATURE TREND

EOF #1

PC #1

Page 7:

EL NINO/SOUTHERN OSCILLATION (ENSO)

EOF #2

PC #2

Multivariate ENSO Index (“MEI”)

Page 8:

NORTH ATLANTIC OSCILLATION

EOF #3

PC #3

Page 9:

EOF #3

PC #3

NORTH ATLANTIC OSCILLATION

Page 10:

TROPICAL ATLANTIC “DIPOLE”

EOF #3

PC #3

Page 11:

ATLANTIC MULTIDECADAL OSCILLATION

EOF #5

PC #5

Page 12:

EOF #5

PC #5

ATLANTIC MULTIDECADAL OSCILLATION

Page 13:

EOF #5

PC #5

ATLANTIC MULTIDECADAL OSCILLATION

Page 14:

PCA as an SVD on the Data Matrix X

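Page 15:

Recall from our earlier lecture the variance-covariance matrix A in the multivariate regression problem:

$$A\,b = c$$

$$A = \begin{pmatrix}
\sum_i \hat{x}_1^2 & \sum_i \hat{x}_1 \hat{x}_2 & \sum_i \hat{x}_1 \hat{x}_3 & \cdots & \sum_i \hat{x}_1 \hat{x}_M \\
\sum_i \hat{x}_2 \hat{x}_1 & \sum_i \hat{x}_2^2 & \sum_i \hat{x}_2 \hat{x}_3 & \cdots & \sum_i \hat{x}_2 \hat{x}_M \\
\sum_i \hat{x}_3 \hat{x}_1 & \sum_i \hat{x}_3 \hat{x}_2 & \sum_i \hat{x}_3^2 & \cdots & \sum_i \hat{x}_3 \hat{x}_M \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum_i \hat{x}_M \hat{x}_1 & \sum_i \hat{x}_M \hat{x}_2 & \sum_i \hat{x}_M \hat{x}_3 & \cdots & \sum_i \hat{x}_M^2
\end{pmatrix}$$

(each sum runs over the observations i)

The eigenvectors of A comprise an orthogonal predictor set (Principal Components Regression)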

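Page 16:

Singular Value Decomposition (SVD)

Let us return to the data matrix (assume it has zero mean):

$$X = \begin{pmatrix}
\hat{x}_1^{(1)} & \hat{x}_1^{(2)} & \cdots & \hat{x}_1^{(N)} \\
\hat{x}_2^{(1)} & \hat{x}_2^{(2)} & \cdots & \hat{x}_2^{(N)} \\
\hat{x}_3^{(1)} & \hat{x}_3^{(2)} & \cdots & \hat{x}_3^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
\hat{x}_M^{(1)} & \hat{x}_M^{(2)} & \cdots & \hat{x}_M^{(N)}
\end{pmatrix}$$

Assume M>N (overdetermined; greater number of “equations” than “unknowns”)

We can write

$$X = U S V^T = \sum_{k=1}^{N} s_k u_k v_k^T$$

where U, V are unitary matrices (orthogonal matrices if X is real-valued), U is MxN, S is diagonal NxN with diagonal elements s_k, and V is NxN. The vectors u_k and v_k are the k-th columns of U and V.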

Page 17:

Singular Value Decomposition (SVD)

Typically, we are interested in the case N>M.

A revised overdetermined problem can be obtained by redefining the problem:

$$X^T = U S V^T$$

We can then write

$$X^T = U S V^T = \sum_{k=1}^{M} s_k u_k v_k^T$$

where U, V are unitary matrices (orthogonal matrices if X is real-valued), U is NxM, S is diagonal MxM with diagonal elements s_k, and V is MxM

$$X = (U S V^T)^T = V S^T U^T = V S U^T = \sum_{k=1}^{M} s_k v_k u_k^T$$

$$X = \begin{pmatrix}
\hat{x}_1^{(1)} & \hat{x}_1^{(2)} & \cdots & \hat{x}_1^{(N)} \\
\hat{x}_2^{(1)} & \hat{x}_2^{(2)} & \cdots & \hat{x}_2^{(N)} \\
\hat{x}_3^{(1)} & \hat{x}_3^{(2)} & \cdots & \hat{x}_3^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
\hat{x}_M^{(1)} & \hat{x}_M^{(2)} & \cdots & \hat{x}_M^{(N)}
\end{pmatrix}$$

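Page 18:

$$A = \begin{pmatrix}
\sum_i \hat{x}_1^2 & \sum_i \hat{x}_1 \hat{x}_2 & \sum_i \hat{x}_1 \hat{x}_3 & \cdots & \sum_i \hat{x}_1 \hat{x}_M \\
\sum_i \hat{x}_2 \hat{x}_1 & \sum_i \hat{x}_2^2 & \sum_i \hat{x}_2 \hat{x}_3 & \cdots & \sum_i \hat{x}_2 \hat{x}_M \\
\sum_i \hat{x}_3 \hat{x}_1 & \sum_i \hat{x}_3 \hat{x}_2 & \sum_i \hat{x}_3^2 & \cdots & \sum_i \hat{x}_3 \hat{x}_M \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\sum_i \hat{x}_M \hat{x}_1 & \sum_i \hat{x}_M \hat{x}_2 & \sum_i \hat{x}_M \hat{x}_3 & \cdots & \sum_i \hat{x}_M^2
\end{pmatrix} = X X^T$$

Each sum is over the observations, i.e. $\sum_i \hat{x}_j \hat{x}_k \equiv \sum_{i=1}^{N} \hat{x}_j^{(i)} \hat{x}_k^{(i)}$, so that

$$X X^T = \begin{pmatrix}
\hat{x}_1^{(1)} & \hat{x}_1^{(2)} & \cdots & \hat{x}_1^{(N)} \\
\hat{x}_2^{(1)} & \hat{x}_2^{(2)} & \cdots & \hat{x}_2^{(N)} \\
\hat{x}_3^{(1)} & \hat{x}_3^{(2)} & \cdots & \hat{x}_3^{(N)} \\
\vdots & \vdots & \ddots & \vdots \\
\hat{x}_M^{(1)} & \hat{x}_M^{(2)} & \cdots & \hat{x}_M^{(N)}
\end{pmatrix}
\begin{pmatrix}
\hat{x}_1^{(1)} & \hat{x}_2^{(1)} & \hat{x}_3^{(1)} & \cdots & \hat{x}_M^{(1)} \\
\hat{x}_1^{(2)} & \hat{x}_2^{(2)} & \hat{x}_3^{(2)} & \cdots & \hat{x}_M^{(2)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
\hat{x}_1^{(N)} & \hat{x}_2^{(N)} & \hat{x}_3^{(N)} & \cdots & \hat{x}_M^{(N)}
\end{pmatrix}$$

$$X X^T = V S U^T U S V^T = V S I_{(M \times M)} S V^T = V S^2 V^T = V S^2 V^{-1}$$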

Page 19:

V is a unitary matrix which diagonalizes $X X^T$!

$$X X^T = V S U^T U S V^T = V S I_{(M \times M)} S V^T = V S^2 V^T = V S^2 V^{-1}$$

There is a mathematical equivalence between taking the Singular Value Decomposition (SVD) of X and finding the eigenvectors of $A = X X^T$

Thus, $S^2$ contains the eigenvalues of $X X^T$
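The equivalence can be checked numerically. The following is a minimal sketch on synthetic data, assuming NumPy and the convention X = V S U^T with X of size MxN; the sizes and random seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 50                          # M variables (rows), N observations (columns)
X = rng.standard_normal((M, N))
X -= X.mean(axis=1, keepdims=True)    # zero mean, as assumed in the lecture

# SVD in the lecture's notation: X = V S U^T
V, s, Ut = np.linalg.svd(X, full_matrices=False)

# Eigen-decomposition of the symmetric matrix A = X X^T
A = X @ X.T
eigvals, eigvecs = np.linalg.eigh(A)  # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]     # reorder to descending, matching the SVD
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# S^2 holds the eigenvalues of X X^T
print(np.allclose(eigvals, s**2))
# The eigenvectors agree with the columns of V up to an arbitrary sign
print(np.allclose(np.abs(eigvecs.T @ V), np.eye(M), atol=1e-8))
# A = V S^2 V^T
print(np.allclose(V @ np.diag(s**2) @ V.T, A))
```

The sign of each eigenvector is arbitrary, which is why the comparison uses absolute values of the inner products.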

Page 20:

U contains as its columns the temporal patterns or Principal Components (“PC”s) corresponding to the M eigenvalues, which are the “right singular vectors” of the SVD:

$$X = V S U^T = \sum_{k=1}^{M} s_k v_k u_k^T$$

$$V^T X = S U^T, \qquad v_k^T X = s_k u_k^T$$

V contains as its columns the spatial patterns or Empirical Orthogonal Functions (“EOF”s) corresponding to the M eigenvalues, which are the “left singular vectors” of the SVD:

$$X U = V S, \qquad X u_k = s_k v_k$$

Here s_k is the k-th diagonal element of S, and u_k and v_k are the k-th columns of U and V.
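As an illustration of these relations, here is a minimal sketch on synthetic data. It assumes NumPy and the convention X = V S U^T with X of size MxN; the variable names are placeholders.

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 3, 30
X = rng.standard_normal((M, N))
X -= X.mean(axis=1, keepdims=True)

V, s, Ut = np.linalg.svd(X, full_matrices=False)  # X = V S U^T
U = Ut.T                                          # columns of U: PCs (length N)

pcs_scaled = V.T @ X    # projecting the data onto the EOFs gives S U^T
eofs_scaled = X @ U     # projecting the data onto the PCs gives V S

assert np.allclose(pcs_scaled, np.diag(s) @ U.T)
assert np.allclose(eofs_scaled, V @ np.diag(s))
assert np.allclose(U.T @ U, np.eye(M))            # PCs are orthonormal
assert np.allclose(V.T @ V, np.eye(M))            # EOFs are orthonormal
```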

Page 21:

FILTERING WITH EIGENVECTORS

$$X = V S U^T = \sum_{k=1}^{M} s_k v_k u_k^T$$

We can filter the original data with a subset of M* eigenvectors:

$$X^{(M^*)} = \sum_{k=1}^{M^*} s_k v_k u_k^T$$

where s_k is the k-th diagonal element of S, v_k is the k-th EOF (column of V), and u_k is the k-th PC (column of U).
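The truncated sum can be sketched as follows, assuming NumPy, synthetic data, and the convention X = V S U^T. The function name `pca_filter` is a hypothetical helper, not something from the lecture.

```python
import numpy as np

def pca_filter(X, m_star):
    """Rank-m_star reconstruction: sum of s_k v_k u_k^T over the leading m_star modes."""
    V, s, Ut = np.linalg.svd(X, full_matrices=False)
    return (V[:, :m_star] * s[:m_star]) @ Ut[:m_star, :]

rng = np.random.default_rng(3)
M, N = 6, 80
X = rng.standard_normal((M, N))
X -= X.mean(axis=1, keepdims=True)

X2 = pca_filter(X, 2)                          # keep the two leading modes

s = np.linalg.svd(X, compute_uv=False)
retained = (s[:2] ** 2).sum() / (s ** 2).sum() # fraction of total variance kept
```

Retaining all M modes returns the original data, and the fraction of variance kept is the sum of the retained squared singular values divided by the sum over all modes.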

Page 22:

Some Additional Considerations:

• Standardization & Areal Weighting
• Gappy Data
• Frequency domain
• “Rotation”
• Selection Rules

Page 23:

SELECTION RULES

How many eigenvectors do we consider significant?

• Eigenvalue > 1/M (with eigenvalues expressed as fractions of the total variance)

• Break in slope in eigenvalue spectrum (“Scree” test) or log eigenvalue (“LEV”) spectrum

• Eigenvalue lies outside the expected distribution for M uncorrelated Gaussian time series of length N (Preisendorfer Rule N). This is an example of a Monte Carlo method

• Rule N’ (takes serial correlation into account)

There is no uniquely defensible criterion...
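A Monte Carlo test in the spirit of Rule N can be sketched as below. This is a simplified illustration, not Preisendorfer's exact procedure: it assumes standardized data, white Gaussian surrogates, a 95th-percentile threshold per rank, and a hypothetical function name `rule_n`.

```python
import numpy as np

def rule_n(X, n_trials=200, pct=95, seed=0):
    """Count leading eigenvalues that exceed the noise threshold at their rank."""
    M, N = X.shape
    rng = np.random.default_rng(seed)

    def norm_eigs(D):
        # Eigenvalues of D D^T, normalized to sum to 1 (fractions of variance)
        D = D - D.mean(axis=1, keepdims=True)
        lam = np.linalg.svd(D, compute_uv=False) ** 2
        return lam / lam.sum()

    # Eigenvalue spectra of M uncorrelated Gaussian series of length N
    noise = np.array([norm_eigs(rng.standard_normal((M, N)))
                      for _ in range(n_trials)])
    threshold = np.percentile(noise, pct, axis=0)   # one threshold per rank

    above = norm_eigs(X) > threshold
    # Stop at the first eigenvalue that fails the test
    return M if above.all() else int(np.argmin(above))

# Synthetic example: one shared sinusoidal signal plus white noise
rng = np.random.default_rng(4)
M, N = 8, 200
signal = np.outer(np.ones(M), np.sin(np.linspace(0, 20, N)))
X = 3.0 * signal + rng.standard_normal((M, N))
X /= X.std(axis=1, keepdims=True)                   # standardize each series

k = rule_n(X)   # number of modes judged significant
```

Because the surrogates are serially uncorrelated, this sketch does not address the serial-correlation correction of Rule N’.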

Page 24:

SELECTION RULES

Preisendorfer Rule N

Page 25:

Asymptotic results of Preisendorfer Rule N for large sample size

(N,M>100 or so)

=N/M

SELECTION RULES

Page 26:

MATLAB EXAMPLE:

NORTH ATLANTIC SEA LEVEL PRESSURE DATA

1899-1999