Principal Component Analysis for SPAT PG course Joanna D. Haigh

Page 1: Principal Component Analysis for SPAT PG course

Principal Component Analysis

for SPAT PG course

Joanna D. Haigh

Page 2: Principal Component Analysis for SPAT PG course

PCA also known as…

• Empirical Orthogonal Function (EOF) Analysis

• Singular Value Decomposition
• Hotelling Transform
• Karhunen-Loève Transform

11 Nov 2013

Page 3: Principal Component Analysis for SPAT PG course

Purpose/applications

• To identify internal structure in a dataset (e.g. “modes of variability”)

• Data compression – by identifying redundancy, reducing dimensionality

• Noise reduction
• Feature identification, classification….

Page 4: Principal Component Analysis for SPAT PG course

Basic approach

Data measured as function of two variables
• E.g. surface pressure (space, time)
• If measurements at two points in space are highly correlated in time then we only need one measure (not two) as a function of time to identify their behaviour.
• How many measures we need overall depends on correlations between each point and every other.

Page 5: Principal Component Analysis for SPAT PG course

Correlations

[Figure: value at point 2 plotted against value at point 1, with the rotated axes PC1 and PC2 marked]

• Measurements at point 1 and point 2 are highly correlated.
• The main (average) signal is measured in the direction of PC1.
• Deviations (the interesting bit?) are in PC2.
• To calculate PCs we need to rotate the axes.
• With M points, just rotate in M dimensions.
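The two-point picture can be sketched numerically. This is a minimal illustration, not the lecturer's own code: the data are synthetic (a shared signal plus small independent noise at each point), and the names `point1`, `point2` and the noise level 0.2 are assumptions made for the example. The rotation is obtained from the eigenvectors of the 2x2 covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_times = 500

# Hypothetical data: one shared signal seen at both points, plus small independent noise.
signal = rng.normal(size=n_times)
point1 = signal + 0.2 * rng.normal(size=n_times)
point2 = signal + 0.2 * rng.normal(size=n_times)
data = np.vstack([point1, point2])              # shape (2, n_times)

# Remove the time mean at each point, then form the 2x2 covariance matrix.
anomalies = data - data.mean(axis=1, keepdims=True)
cov = anomalies @ anomalies.T / (n_times - 1)

# The eigenvectors of the covariance matrix are the rotated axes.
eigvals, eigvecs = np.linalg.eigh(cov)          # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]               # re-rank: largest variance first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pcs = eigvecs.T @ anomalies                     # row 0 is PC1, row 1 is PC2
angle_deg = np.degrees(np.arctan2(eigvecs[1, 0], eigvecs[0, 0])) % 180.0
variance_fraction = eigvals / eigvals.sum()

print("PC1 axis angle from the point-1 axis (degrees):", angle_deg)
print("fraction of variance in PC1, PC2:", variance_fraction)
```

With two equally noisy, highly correlated points, the PC1 axis should lie close to the 45 degree diagonal and carry nearly all of the variance, which is the sense in which one measure rather than two is enough.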

Page 6: Principal Component Analysis for SPAT PG course

Approach

E.g. data measured N times at M spatial points. In M-dimensional space:
i. Find axis of greatest correlation, i.e. main variability; this is PC1.
ii. Find axis orthogonal to this of next highest variability; this is PC2.
iii. Continue until M new axes, i.e. M PCs.

Each PC is composed of a weighted average of the original axes. The weightings are the EOFs.
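Steps i to iii can be mimicked literally, one axis at a time. The sketch below uses power iteration with deflation, which is a technique chosen here for illustration and not something the slides prescribe; in practice a standard eigen-solver finds all axes at once. The synthetic data, the function names `leading_axis` and `sequential_axes`, and the iteration count are all assumptions.

```python
import numpy as np

def leading_axis(cov, rng, n_iter=500):
    """Power iteration: unit vector along which `cov` has the greatest variance."""
    v = rng.normal(size=cov.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        w = cov @ v
        v = w / np.linalg.norm(w)
    return v

def sequential_axes(anomalies, rng):
    """Find M axes one by one, removing each found axis before the next search."""
    m, n = anomalies.shape
    cov = anomalies @ anomalies.T / (n - 1)
    axes, variances = [], []
    for _ in range(m):
        v = leading_axis(cov, rng)              # step i / ii: axis of greatest remaining variability
        lam = v @ cov @ v                       # variance along that axis
        axes.append(v)
        variances.append(lam)
        cov = cov - lam * np.outer(v, v)        # deflation: later axes are orthogonal to this one
    return np.column_stack(axes), np.array(variances)

rng = np.random.default_rng(1)
m, n = 4, 2000

# Hypothetical data: independent sources of decreasing amplitude, mixed by a random rotation.
mixing, _ = np.linalg.qr(rng.normal(size=(m, m)))
sources = np.array([3.0, 2.0, 1.0, 0.5])[:, None] * rng.normal(size=(m, n))
data = mixing @ sources
anomalies = data - data.mean(axis=1, keepdims=True)

axes, variances = sequential_axes(anomalies, rng)
print("variance along each successive axis:", variances)
```

Each column of `axes` holds the weightings of the original M axes for one new axis, i.e. one EOF in the terminology above.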

Page 7: Principal Component Analysis for SPAT PG course

Concept

• Often it is possible to identify a particular mode/feature with an EOF.

• Each PC indicates the variation with time (in our example) of the mode identified with its EOF.

• Once EOFs established can project other datasets (e.g. different time periods) onto them to compare behaviours.
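Projecting a second dataset onto established EOFs can be sketched as follows. Everything here is hypothetical: the spatial `pattern`, the two synthetic periods and their amplitudes, and the choice to use the reference-period mean as the baseline for both periods.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n_ref, n_new = 5, 400, 100

# Hypothetical spatial pattern present in both periods.
pattern = np.array([1.0, 0.8, 0.3, -0.4, -0.9])
pattern /= np.linalg.norm(pattern)

def make_period(n_times, amplitude):
    """Synthetic (m, n_times) dataset: one mode of given amplitude plus noise."""
    mode = amplitude * rng.normal(size=n_times)
    return np.outer(pattern, mode) + 0.3 * rng.normal(size=(m, n_times))

reference = make_period(n_ref, amplitude=2.0)   # period used to establish the EOFs
other = make_period(n_new, amplitude=1.0)       # different period to compare

# Establish the EOFs from the reference period only.
ref_mean = reference.mean(axis=1, keepdims=True)
ref_anom = reference - ref_mean
eigvals, eigvecs = np.linalg.eigh(ref_anom @ ref_anom.T / (n_ref - 1))
eofs = eigvecs[:, ::-1]                         # columns in order of decreasing eigenvalue

# Project both periods onto the same EOFs (assumed baseline: the reference mean).
ref_pcs = eofs.T @ ref_anom
other_pcs = eofs.T @ (other - ref_mean)

print("PC1 variance, reference period:", ref_pcs[0].var(ddof=1))
print("PC1 variance, other period:    ", other_pcs[0].var(ddof=1))
```

Because both sets of PCs refer to the same EOFs, their time series can be compared directly; in this synthetic case the mode is weaker in the second period by construction.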


Page 8: Principal Component Analysis for SPAT PG course

ENSO as EOF1 of SST data

• EOF1 of tropical Pacific SSTs: 576 monthly anomalies, Jan 1950 - Dec 1997
• EOF1 explains 45% of the total SST variance over this domain.

http://www.esrl.noaa.gov/psd/enso/impacts/currentclimo.html

Page 9: Principal Component Analysis for SPAT PG course

Maths

• Calculate M×M covariance matrix
• Find eigenvectors and eigenvalues
• EOFs are the M eigenvectors, ranked in order of decreasing eigenvalue
• Eigenvalues give measure of variance
• PCs from decomposition of data onto EOFs.
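The recipe above maps onto a few lines of NumPy. This is a minimal sketch under stated assumptions: the function name `eof_analysis` is invented for the example, the data are synthetic, and the covariance is normalised by N - 1 (a choice that rescales the eigenvalues but leaves the EOFs unchanged).

```python
import numpy as np

def eof_analysis(data):
    """data: (M, N) array, M points by N samples.

    Returns eofs (M, M, one EOF per column), eigenvalues (M,) and pcs (M, N),
    all ranked in order of decreasing eigenvalue.
    """
    n = data.shape[1]
    anomalies = data - data.mean(axis=1, keepdims=True)
    cov = anomalies @ anomalies.T / (n - 1)      # M x M covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)       # symmetric solver, ascending order
    order = np.argsort(eigvals)[::-1]            # rank by decreasing eigenvalue
    eigvals, eofs = eigvals[order], eigvecs[:, order]
    pcs = eofs.T @ anomalies                     # decompose the data onto the EOFs
    return eofs, eigvals, pcs

rng = np.random.default_rng(3)
data = rng.normal(size=(6, 300)).cumsum(axis=0)  # hypothetical correlated data
eofs, eigvals, pcs = eof_analysis(data)
explained = eigvals / eigvals.sum()

print("fraction of variance per EOF:", explained)
```

The variance of each PC time series equals the corresponding eigenvalue, which is why the eigenvalues can be read as a measure of variance.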


Page 10: Principal Component Analysis for SPAT PG course

Examples of applications

Application | M | N | Visualise data | EOFs: weightings of | PCs
Meteorology | space | time | Time series at each place (or map at each time) | places (maps) | Time series of EOF maps
Earth obs (e.g. land cover) | spectral bands | space | Map in each wavelength band | bands | Maps of band combos
Earth obs (e.g. cloud) | cases | wavelength | Spectrum for each case | cases | Spectra of case combos
Polarity of IMF | solar longitude | time | IMF polarity f(longitude) at each time | longitudes | Time series of lon. distbn

Page 11: Principal Component Analysis for SPAT PG course

High cloud E. Asia

Kang et al (1997)


Page 12: Principal Component Analysis for SPAT PG course

Southern Annular Mode

geopotential height of 1000hPa surface



Page 14: Principal Component Analysis for SPAT PG course

Landsat Thematic Mapper (Wageningen)

[Figure: image panels for the bands at 0.5, 0.6, 0.7, 0.8, 1.6 and 2.2 µm]

Page 15: Principal Component Analysis for SPAT PG course

example of TM EOFs (unnormalised)

[NB not for Wageningen images]

Band wavelengths (µm): 0.5, 0.6, 0.7, 0.8, 1.6, 2.2, 11.5

EOF | 1 | 2 | 3 | 4 | 5 | 6 | 7
Eigenvalue | 1011 | 131 | 38 | 7 | 4 | 2 | 1
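Since the eigenvalues give a measure of variance, the quoted values can be turned into fractions of total variance. This is simple arithmetic on the seven numbers above, assuming they are complete and on a common scale; they appear rounded, so the percentages are approximate.

```python
# Eigenvalues as quoted for the example TM EOFs.
eigenvalues = [1011, 131, 38, 7, 4, 2, 1]

total = sum(eigenvalues)
fractions = [lam / total for lam in eigenvalues]
cumulative = [sum(fractions[:k + 1]) for k in range(len(fractions))]

for k, (frac, cum) in enumerate(zip(fractions, cumulative), start=1):
    print(f"EOF {k}: {frac:6.1%} of variance, cumulative {cum:6.1%}")
```

On these numbers the first EOF carries about 85% of the variance, the first two about 96% and the first three about 99%, which illustrates how much redundancy there is between the seven bands.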


Page 17: Principal Component Analysis for SPAT PG course

Modelled IR spectra of cirrus cloud

Bantges et al (1999)


Page 18: Principal Component Analysis for SPAT PG course

PC0: Average

PC1: Ice water path

PC2: Effective radius

PC3: Aspect ratio

Bantges et al (1999)


Page 20: Principal Component Analysis for SPAT PG course

Polarity of Interplanetary Magnetic Field

Cadavid et al (2007)

Page 21: Principal Component Analysis for SPAT PG course

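Maths – a little more detail

Represent data by M×N matrix $D$.
M×M covariance matrix is $C = (D - \bar{D})(D - \bar{D})^T$, where $\bar{D}$ is the mean of $D$ over the N samples.
Calculate eigenvalues $\lambda_i$ and eigenvectors $v_i$, $i = 1, \dots, M$.
EOFs are the columns of the M×M matrix of eigenvectors $E$.
M×N matrix of PCs is $P = E^T D$.

NB can rewrite $D = (E^T)^{-1} P = E P$ ($E$ is orthogonal, because $C$ is symmetric), i.e. PCs give weighting of EOFs in data.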
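These matrix identities can be checked numerically. The sketch below is an illustration on synthetic data, not part of the original slides. It assumes the projection is applied to the mean-removed data (named `A` here), and it also checks the link to the singular value decomposition listed among the alternative names for PCA.

```python
import numpy as np

rng = np.random.default_rng(4)
M, N = 4, 50

D = rng.normal(size=(M, N)).cumsum(axis=0)   # hypothetical M x N data matrix
A = D - D.mean(axis=1, keepdims=True)        # data minus its mean over the N samples
C = A @ A.T                                  # M x M covariance matrix (unnormalised)

lam, E = np.linalg.eigh(C)                   # eigenvalues ascending, eigenvectors in columns
lam, E = lam[::-1], E[:, ::-1]               # rank by decreasing eigenvalue
P = E.T @ A                                  # M x N matrix of PCs

e_is_orthogonal = np.allclose(E.T @ E, np.eye(M))       # so the inverse of E^T is E
data_recovered = np.allclose(E @ P, A)                  # PCs weight the EOFs to rebuild the data
pcs_uncorrelated = np.allclose(P @ P.T, np.diag(lam))   # PC covariance is diagonal

# Singular values of the mean-removed data are the square roots of the eigenvalues.
s = np.linalg.svd(A, compute_uv=False)
svd_matches = np.allclose(s**2, lam)

print(e_is_orthogonal, data_recovered, pcs_uncorrelated, svd_matches)
```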


Page 22: Principal Component Analysis for SPAT PG course

Data reduction/noise removal

• Higher order PCs are composed of lowest correlations so uncorrelated noise lies in these.

• Can reconstruct data omitting higher order EOFs to reduce noise.

• Can reduce data by keeping only PCs of lowest order EOFs.
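Both uses can be sketched with a truncated reconstruction. The example is synthetic and every specific in it is an assumption: two sinusoidal spatial patterns as the "true" field, a noise level of 0.5, and a choice to keep k = 2 leading EOFs. In real data the right number to keep is a judgment call.

```python
import numpy as np

rng = np.random.default_rng(5)
m, n, k = 20, 500, 2

# Hypothetical "true" field: two spatial patterns with independent time series.
x = np.linspace(0.0, 1.0, m)
patterns = np.vstack([np.sin(np.pi * x), np.sin(2 * np.pi * x)])   # shape (2, m)
amplitudes = rng.normal(size=(2, n)) * np.array([[3.0], [1.5]])
clean = patterns.T @ amplitudes                                    # shape (m, n)
noisy = clean + 0.5 * rng.normal(size=(m, n))                      # add uncorrelated noise

# EOFs of the noisy data, ranked by decreasing eigenvalue.
mean = noisy.mean(axis=1, keepdims=True)
anomalies = noisy - mean
eigvals, eigvecs = np.linalg.eigh(anomalies @ anomalies.T / (n - 1))
eofs = eigvecs[:, ::-1]

# Data reduction: store only k PCs (k x n numbers) plus k EOFs and the mean.
pcs = eofs[:, :k].T @ anomalies

# Noise reduction: reconstruct the data from the leading k EOFs only.
reconstructed = mean + eofs[:, :k] @ pcs

def rms(a):
    """Root-mean-square value of an array."""
    return np.sqrt(np.mean(a**2))

error_noisy = rms(noisy - clean)
error_reconstructed = rms(reconstructed - clean)
print("RMS error before truncation:", error_noisy)
print("RMS error after truncation: ", error_reconstructed)
```

The truncated reconstruction should sit closer to the clean field than the noisy data does, because most of the uncorrelated noise lies along the discarded EOFs.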


Page 23: Principal Component Analysis for SPAT PG course

Books

R W Preisendorfer (1988), Principal Component Analysis in Meteorology and Oceanography, Elsevier

I T Jolliffe (2002), Principal Component Analysis, Springer