Variational Gaussian-process factor analysis for modeling spatio-temporal data

Jaakko Luttinen and Alexander Ilin

NIPS 2009

Presented by Bo Chen 2.26, 2010

Outline

• Introduction: Factor Analysis (FA)

• Introduction: Gaussian Process (GP)

• Spatio-Temporal Factor Analysis

• Factor Analysis with GP prior (GPFA)

• Variational Bayesian Inference

• Speeding up GPFA

• Experiments

The Applications of Factor Analysis

• 1. Dimensionality Reduction

• 2. Dictionary Learning (Denoising and Inpainting)

• 3. Feature Selection (Gene Analysis)

• 4. Matrix Completion (Regression)

• 5. Spatial Dynamic Data Analysis

• … Uncover the prominent structure in the data

Gaussian Process

A Gaussian process is a joint Gaussian distribution over the function values {f(x)} at any finite set of n instances x, i.e. a probability distribution over functions:

$f(x) \sim \mathcal{GP}(m(x), K(x, x'))$

Pros:
• Utilizes extra information from the input space
• Nonlinearity

Cons:
• Computational complexity
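To make "a distribution over functions" concrete, here is a minimal NumPy sketch that draws a few functions from a zero-mean GP prior. The squared exponential kernel, the lengthscale, the jitter value, and the function names are assumptions chosen for illustration; they are not taken from the slides.

```python
import numpy as np

def squared_exponential(x1, x2, lengthscale=1.0):
    """k(x, x') = exp(-(x - x')^2 / (2 * lengthscale^2)) for 1-D inputs."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def sample_gp_prior(x, lengthscale=1.0, n_samples=3, jitter=1e-6, seed=0):
    """Draw n_samples functions from a zero-mean GP prior, evaluated at x."""
    rng = np.random.default_rng(seed)
    # The jitter on the diagonal keeps the Cholesky factorization stable.
    K = squared_exponential(x, x, lengthscale) + jitter * np.eye(len(x))
    L = np.linalg.cholesky(K)
    # If e ~ N(0, I), then L @ e ~ N(0, K).
    return (L @ rng.standard_normal((len(x), n_samples))).T

x = np.linspace(0.0, 10.0, 50)
K = squared_exponential(x, x, lengthscale=2.0)
samples = sample_gp_prior(x, lengthscale=2.0)  # shape (3, 50)
```

The Cholesky step with n inputs costs O(n^3), which is the computational drawback listed under "Cons".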

Spatio-Temporal Factor Analysis

• w:d: the d-th factor vector, distributed over space (spatial information)

• xd:: the time series of factor d (time information)

The m-th row of Y corresponds to a spatial location lm (e.g., a location on a two-dimensional map), and the n-th column corresponds to a time instance tn.

(M. N. Schmidt, ICML 2009)
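The model equation on this slide was an image and is not preserved. Assuming the usual factor-analysis decomposition with an M×N data matrix Y, an M×D loading matrix W, and a D×N factor matrix X (the noise symbol below is illustrative), it can be written as:

```latex
Y \approx W X = \sum_{d=1}^{D} w_{:d}\, x_{d:}^{\mathsf T},
\qquad
y_{mn} = \sum_{d=1}^{D} w_{md}\, x_{dn} + \epsilon_{mn}
```

Each term of the sum is a rank-one matrix: one spatial pattern w:d scaled over time by its time series xd:.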

Introduce Gaussian Process Prior

Each time signal xd: contains values of a latent function X(t) computed at the time instances tn.

Each spatial signal w:d contains measurements of a function W(l) at different locations lm.

The likelihood function of the observed data:
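The equations for the priors and the likelihood were images and are not preserved. A reconstruction under the assumptions stated on the slides (independent zero-mean GP priors per component, Gaussian observation noise, and a product over the observed entries only) is given below; the symbols K^x_d, K^w_d, and the index set O of observed entries are notation introduced here.

```latex
\begin{aligned}
x_{d:} &\sim \mathcal{N}\!\left(0,\ K^{x}_{d}\right), \qquad
w_{:d} \sim \mathcal{N}\!\left(0,\ K^{w}_{d}\right), \qquad d = 1, \dots, D \\
p(Y \mid W, X, \sigma) &= \prod_{(m,n) \in O}
\mathcal{N}\!\left(y_{mn} \,\middle|\, w_{m:}^{\mathsf T} x_{:n},\ \sigma_{mn}^{2}\right)
\end{aligned}
```

Here K^x_d is the N×N covariance matrix obtained by evaluating the d-th temporal kernel at all pairs of time instances, and K^w_d is the M×M matrix obtained from the d-th spatial kernel at all pairs of locations.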

Variational Bayesian Inference

The approximation of the true posterior:

The lower bound of the marginal log-likelihood:
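Both expressions on this slide were images and are not preserved. Assuming a standard mean-field approximation that factorizes over W and X, they take the following form:

```latex
\begin{aligned}
p(W, X \mid Y) &\approx q(W, X) = q(W)\, q(X) \\
\log p(Y) &\ge \int q(W)\, q(X)\,
\log \frac{p(Y \mid W, X)\, p(W)\, p(X)}{q(W)\, q(X)} \, dW \, dX
\end{aligned}
```

The gap between the two sides of the inequality is the Kullback-Leibler divergence from q(W, X) to the true posterior, so raising the bound tightens the approximation.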

Maximizing the lower bound, we obtain:

Inferred Posterior

where z: is a DN×1 vector formed by concatenating the vectors:

U is a DN×DN block-diagonal matrix with the following D×D matrices on the diagonal:

In the paper, the authors assume isotropic noise:
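The posterior equations on this slide were images. The sketch below assumes the standard linear-Gaussian form that matches the description of z: and U: covariance (Kx^-1 + U)^-1 and mean (Kx^-1 + U)^-1 z:, where the n-th block of z: sums w_m: y_mn / sigma^2 over the locations observed at time n, and the n-th block of U sums w_m: w_m:^T / sigma^2 over the same locations. It also treats W as known, replacing the expectations over q(W) with point values. The function names and the toy sizes are illustrative.

```python
import numpy as np

def se_kernel(t, lengthscale, jitter=1e-6):
    """Squared exponential covariance over the time instances t."""
    diff = t[:, None] - t[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2) + jitter * np.eye(len(t))

def posterior_x(Y, mask, W, kernels, sigma2):
    """Gaussian posterior over vec(X), stacked time-major (index n * D + d).

    Y: (M, N) data; mask: (M, N) bool, True where observed;
    W: (M, D) loadings treated as known (a simplifying assumption);
    kernels: list of D prior covariances, each (N, N); sigma2: noise variance.
    """
    M, N = Y.shape
    D = W.shape[1]
    # Prior covariance of vec(X): components are independent GPs over time.
    Kx = np.zeros((D * N, D * N))
    for d, Kd in enumerate(kernels):
        E = np.zeros((D, D))
        E[d, d] = 1.0
        Kx += np.kron(Kd, E)
    z = np.zeros(D * N)
    U = np.zeros((D * N, D * N))
    for n in range(N):
        Wn = W[mask[:, n]]          # loadings of locations observed at time n
        yn = Y[mask[:, n], n]
        block = slice(n * D, (n + 1) * D)
        z[block] = Wn.T @ yn / sigma2
        U[block, block] = Wn.T @ Wn / sigma2
    # (Kx^-1 + U)^-1 = (I + Kx U)^-1 Kx, which avoids inverting Kx directly.
    cov = np.linalg.solve(np.eye(D * N) + Kx @ U, Kx)
    mean = cov @ z
    return mean.reshape(N, D).T, cov  # mean returned as a (D, N) matrix

rng = np.random.default_rng(0)
M, N, D = 5, 8, 2
t = np.arange(N, dtype=float)
kernels = [se_kernel(t, 1.0), se_kernel(t, 3.0)]
W = rng.standard_normal((M, D))
X_true = np.stack([np.linalg.cholesky(K) @ rng.standard_normal(N) for K in kernels])
sigma2 = 0.1
Y = W @ X_true + np.sqrt(sigma2) * rng.standard_normal((M, N))
mask = rng.random((M, N)) < 0.7     # roughly 30% of the entries are missing
X_mean, X_cov = posterior_x(Y, mask, W, kernels, sigma2)
```

The linear solve works on a DN×DN matrix, which is what makes the full posterior expensive and motivates the speed-ups that follow.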

Speeding Up GPFA (1)

• Component-Wise Factorization
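The factorization itself was shown as an image. Assuming it means that the posterior approximation is additionally factorized over the D components, it reads:

```latex
q(X) = \prod_{d=1}^{D} q(x_{d:}), \qquad
q(W) = \prod_{d=1}^{D} q(w_{:d})
```

Under this assumption each factor involves only an N×N (or M×M) covariance matrix rather than a DN×DN (or DM×DM) one.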

Speeding Up GPFA (2)

• Inducing inputs

A set of auxiliary variables contains the values of the latent functions Wd(l), Xd(t) at some locations (the inducing inputs).

If the inducing inputs summarize the data well, the latent function values depend on the data only through these auxiliary variables, which defines the approximate posterior.

Maximizing the new variational lower bound then gives the updates for the approximate posterior.

Some VB update details can be found in the paper and in M. K. Titsias, AISTATS 2009.
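The idea behind inducing inputs can be illustrated with the low-rank covariance approximation K_nm K_mm^-1 K_mn, which replaces an N×N matrix by quantities built from a smaller set of inputs. This sketch is not the authors' VB update; the kernel, the sizes, and the function names are assumptions for illustration.

```python
import numpy as np

def se_kernel(a, b, lengthscale=1.0):
    """Squared exponential kernel between two sets of 1-D inputs."""
    diff = a[:, None] - b[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)

def inducing_approximation(t, t_inducing, lengthscale=1.0, jitter=1e-6):
    """Low-rank approximation K_nm K_mm^-1 K_mn of the full covariance K_nn."""
    K_nm = se_kernel(t, t_inducing, lengthscale)
    K_mm = se_kernel(t_inducing, t_inducing, lengthscale)
    K_mm = K_mm + jitter * np.eye(len(t_inducing))  # numerical stabilization
    return K_nm @ np.linalg.solve(K_mm, K_nm.T)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 20.0, 200)
# Inducing inputs taken as a random subset of the original inputs.
t_inducing = np.sort(rng.choice(t, size=20, replace=False))
K_full = se_kernel(t, t, lengthscale=2.0)
K_approx = inducing_approximation(t, t_inducing, lengthscale=2.0)
max_err = np.abs(K_full - K_approx).max()

# Sanity check: with every input used as an inducing input the
# approximation recovers the full covariance (up to the jitter).
t_small = np.linspace(0.0, 5.0, 10)
K_small = se_kernel(t_small, t_small)
K_small_approx = inducing_approximation(t_small, t_small)
```

The linear solve involves only the 20×20 matrix K_mm, so the cost grows with the number of inducing inputs rather than with the number of data points.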

Computational Complexity

Artificial Experiments

• M=30 sensors (two-dimensional spatial locations)
• N=200 time instances
• D=4 temporal signals xd:, generated by taking samples from GP priors with different covariance kernels (see the Covariance Kernels slide)

The loadings were generated from GPs over the two-dimensional space using the squared exponential covariance kernel.

Data Y: 452 points were selected as observed and the remaining ones treated as missing.

The hyperparameters of the Gaussian processes were initialized randomly close to the values used for data generation, assuming that a good guess about the hidden signals can be obtained by exploratory analysis of the data.

Covariance Kernels

• Squared exponential function to model a slowly changing component

• Periodic function with decay to model a quasi-periodic component

• Compactly supported piecewise polynomial function to model two fast-changing components with different time scales

• Squared exponential to model the spatial information
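The kernel formulas on this slide were images and are not preserved. The sketch below gives common textbook forms of the three kernel families named above, as functions of the distance r. The parameterizations are assumptions: the piecewise polynomial uses the q = 2 form from Rasmussen and Williams (2006) with b = 3, the value for one-dimensional inputs, and all function and parameter names are illustrative.

```python
import numpy as np

def squared_exponential(r, lengthscale):
    """Smooth, slowly varying functions."""
    return np.exp(-0.5 * (r / lengthscale) ** 2)

def quasi_periodic(r, period, smoothness, decay):
    """Periodic kernel multiplied by a squared exponential decay."""
    periodic = np.exp(-2.0 * np.sin(np.pi * r / period) ** 2 / smoothness ** 2)
    return periodic * squared_exponential(r, decay)

def piecewise_polynomial(r, support, b=3):
    """Compactly supported kernel: exactly zero for r >= support."""
    s = np.abs(r) / support
    poly = ((b ** 2 + 4 * b + 3) * s ** 2 + (3 * b + 6) * s + 3.0) / 3.0
    return np.maximum(1.0 - s, 0.0) ** (b + 2) * poly

t = np.linspace(0.0, 10.0, 100)
r = np.abs(t[:, None] - t[None, :])           # pairwise time differences
K_slow = squared_exponential(r, lengthscale=3.0)
K_quasi = quasi_periodic(r, period=2.0, smoothness=1.0, decay=5.0)
K_fast = piecewise_polynomial(r, support=0.5)
sparsity = np.mean(K_fast == 0.0)             # fraction of exactly-zero entries
```

The compact support is what produces a sparse covariance matrix for the fast-changing components, which keeps their computations cheap.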

Results

Reconstruction of Global SST Using the MOHSST5 Dataset

The authors demonstrate how the presented model can be used to reconstruct global sea surface temperatures (SST) from historical measurements.

Data Description:

1. The U.K. Meteorological Office historical SST data set contains monthly SST anomalies for the 1856-1991 period in 5°×5° longitude-latitude bins.

2. The dataset contains in total approximately 1600 time instances and 1700 spatial locations.

3. The dataset is sparse, especially during the 19th century and the World Wars, with 55% of the values missing; it thus consists of more than 10^6 observations in total.

Available at http://iridl.ldeo.columbia.edu/SOURCES/.KAPLAN/.RSA_MOHSST5.cuf/.OS/.ssta/?help+datafiles

Experimental Methodology

Factor number: D=80
Training set: 20%; Testing set: 80%

Covariance Kernels:
1. Five time signals xd: to describe climate trends: squared exponential kernel.
2. Five temporal components to capture periodic signals: quasi-periodic kernel.
3. Five components to model prominent interannual phenomena such as El Niño: squared exponential kernel.
4. The remaining 65 time signals: piecewise polynomial kernel.
5. Spatial patterns w:d: scaled squared exponential kernel. The distance r between the locations li and lj was measured on the surface of the Earth using the spherical law of cosines.
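The distance used in the spatial kernel can be sketched as follows. The spherical law of cosines itself is standard; the Earth radius value, the use of degrees for the inputs, and the function name are assumptions made for this example.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed for illustration

def great_circle_distance(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Distance on a sphere via the spherical law of cosines (degrees in)."""
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dlon = np.radians(lon2 - lon1)
    cos_angle = (np.sin(phi1) * np.sin(phi2)
                 + np.cos(phi1) * np.cos(phi2) * np.cos(dlon))
    # Clipping guards against rounding slightly outside [-1, 1].
    return radius * np.arccos(np.clip(cos_angle, -1.0, 1.0))

# Distance r between two grid-cell centres, which would then be passed
# to the spatial covariance kernel.
r = great_circle_distance(2.5, 2.5, 2.5, 7.5)
quarter = great_circle_distance(0.0, 0.0, 0.0, 90.0)
```

Measuring r on the sphere rather than in raw longitude-latitude coordinates avoids distorting distances near the poles and across the date line.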

Inducing inputs:
1. Each spatial function wd(l): 500 inducing inputs.
2. The 15 temporal functions X(t) that modeled slow climate variability: (1) the slowest: 80; (2) quasi-periodic: 300; (3) interannual: 300.
3. The remaining temporal phenomena: priors with a sparse covariance matrix, which therefore allow efficient computations.
4. The inducing inputs were taken as a random subset of the original inputs and then kept fixed throughout learning.

Results

[Figure: result panels annotated with El Niño; reported reconstruction errors: 0.5714 and 0.6180]

Conclusions

• 1. Gaussian-process factor analysis models spatio-temporal phenomena on different scales by using properly selected GPs.

• 2. The parameters are inferred with variational Bayesian inference so as to take into account the uncertainty about the unknown parameters.

• 3. The approach uses all available data and combines all modeling assumptions in one estimation procedure.