
Latent Gaussian Processes for Distribution Estimation of Multivariate Categorical Data

Yarin Gal • Yutian Chen • Zoubin Ghahramani

yg279@cam.ac.uk

Multivariate Categorical Data

Distribution Estimation

Given $\mathcal{D} = \{\mathbf{y}_n = (y_{n1}, \ldots, y_{nD}) \mid n = 1, \ldots, N\}$ with $y_{nd} \in \{1, \ldots, K\}$, we want to learn a distribution $P(\mathbf{y})$ from $\mathcal{D}$.
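As a concrete illustration (not from the slides), the simplest such estimator treats the D dimensions as independent multinomials; the models below improve on it by capturing dependencies between the dimensions. A minimal NumPy sketch, with a random array Y standing in for $\mathcal{D}$:

import numpy as np

# Toy stand-in for the data: N observations, D categorical dimensions,
# each value in {0, ..., K-1} (0-indexed for convenience).
N, D, K = 683, 9, 10
rng = np.random.default_rng(0)
Y = rng.integers(0, K, size=(N, D))

# Independent-multinomial baseline: per-dimension category proportions,
# multiplied across dimensions.
probs = np.stack([np.bincount(Y[:, d], minlength=K) / N for d in range(D)])

def p_independent(y):
    # P(y) under the independence assumption: prod_d P(y_d)
    return np.prod(probs[np.arange(D), y])

print(p_independent(Y[0]))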


Example: Wisconsin Breast Cancer

Sample code | Thickness | Unif. Cell Size | Unif. Cell Shape | Marginal Adhesion | Epithelial Cell Size | Bare Nuclei | Bland Chromatin | Normal Nucleoli | Mitoses | Class
1000025 | 5 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | Benign
1002945 | 5 | 4 | 4 | 5 | 7 | 10 | 3 | 2 | 1 | Benign
1015425 | 3 | 1 | 1 | 1 | 2 | 2 | 3 | 1 | 1 | Benign
1016277 | 6 | 8 | 8 | 1 | 3 | 4 | 3 | 7 | 1 | Benign
1017023 | 4 | 1 | 1 | 3 | 2 | 1 | 3 | 1 | 1 | Benign
1017122 | 8 | 10 | 10 | 8 | 7 | 10 | 9 | 7 | 1 | Malignant
1018099 | 1 | 1 | 1 | 1 | 2 | 10 | 3 | 1 | 1 | Benign
1018561 | 2 | 1 | 2 | 1 | 2 | 1 | 3 | 1 | 1 | Benign
1033078 | 2 | 1 | 1 | 1 | 2 | 1 | 1 | 1 | 5 | Benign
1033078 | 4 | 2 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | Benign
1035283 | 1 | 1 | 1 | 1 | 1 | 1 | 3 | 1 | 1 | Benign
1036172 | 2 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | Benign
1041801 | 5 | 3 | 3 | 3 | 2 | 3 | 4 | 4 | 1 | Malignant
1043999 | 1 | 1 | 1 | 1 | 2 | 3 | 3 | 1 | 1 | Benign
1044572 | 8 | 7 | 5 | 10 | 7 | 9 | 5 | 5 | 4 | Malignant
1047630 | 7 | 4 | 6 | 4 | 6 | 1 | 4 | 3 | 1 | Malignant
1048672 | 4 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | Benign
1049815 | 4 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | Benign
1050670 | 10 | 7 | 7 | 6 | 4 | 10 | 4 | 1 | 2 | Malignant
1050718 | 6 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | Benign
1054590 | 7 | 3 | 2 | 10 | 5 | 10 | 5 | 4 | 4 | Malignant
1054593 | 10 | 5 | 5 | 3 | 6 | 7 | 7 | 10 | 1 | Malignant
1056784 | 3 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | Benign
1057013 | 8 | 4 | 5 | 1 | 2 | ? | 7 | 3 | 1 | Malignant
1059552 | 1 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | Benign
1065726 | 5 | 2 | 3 | 4 | 2 | 7 | 3 | 6 | 1 | Malignant
1066373 | 3 | 2 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | Benign
1066979 | 5 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | Benign
1067444 | 2 | 1 | 1 | 1 | 2 | 1 | 2 | 1 | 1 | Benign
1070935 | 1 | 1 | 3 | 1 | 2 | 1 | 1 | 1 | 1 | Benign
1070935 | 3 | 1 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | Benign
1071760 | 2 | 1 | 1 | 1 | 2 | 1 | 3 | 1 | 1 | Benign
1072179 | 10 | 7 | 7 | 3 | 8 | 5 | 7 | 4 | 3 | Malignant

683 patients, 2 × 10^9 possible configurations (nine ten-valued features plus a binary class).

Categorical Latent Gaussian Process

We define the generative model as:

- a continuous latent space of patients: a latent point $\mathbf{x}_n$ per patient,
- a distribution over functions (a GP for each test and outcome) generating weights $f_{ndk}$,
- test results drawn as $y_{nd} \overset{\text{iid}}{\sim} \mathrm{Softmax}(f_{nd1}, \ldots, f_{ndK})$,

so that $\mathbf{y}_n = (y_{n1}, \ldots, y_{nD})$ is the medical assessment of patient $n$: N patients $\mathbf{x}_n$, D categorical tests, each with K possible test results.
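A minimal NumPy sketch of this generative process (an illustration, not the authors' implementation: the RBF covariance, the latent dimensionality Q, and the joint GP draw over all N latent points are assumptions; the actual model uses sparse GPs):

import numpy as np

N, D, K, Q = 100, 9, 10, 2                # patients, tests, outcomes, latent dims
rng = np.random.default_rng(0)

def rbf(X, sf2=1.0, l=1.0):
    # Squared-exponential covariance over the rows of X.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, -1)
    return sf2 * np.exp(-0.5 * sq / l**2)

def softmax(f):
    e = np.exp(f - f.max())
    return e / e.sum()

X = rng.standard_normal((N, Q))           # continuous latent space of patients
C = np.linalg.cholesky(rbf(X) + 1e-6 * np.eye(N))

Y = np.empty((N, D), dtype=int)
for d in range(D):
    F = C @ rng.standard_normal((N, K))   # GP draws: weights f_ndk for test d
    for n in range(N):
        Y[n, d] = rng.choice(K, p=softmax(F[n]))  # y_nd ~ Softmax(f_nd1..f_ndK)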


Categorical Latent Gaussian Process

1. Sparse GPs → linear time complexity.
2. Monte Carlo integration for the non-conjugate likelihood → noisy gradients.
3. Learning-rate free stochastic optimisation → no fine-tuning of learning-rates.
4. Symbolic differentiation → simple and modular code.

import theano
import theano.tensor as T
import theano.sandbox.linalg as sT
# randn, RBF, RBFnn, get_KL_U, get_KL_X and the variables m, s, mu, L,
# Z, Y, sf2, l (and sizes N, Q, M, K) are defined elsewhere.

X = m + s * randn(N, Q)                         # sample latent points
U = mu + L.dot(randn(M, K))                     # sample inducing outputs
Kmm, Kmn, Knn = RBF(sf2, l, Z), RBF(sf2, l, Z, X), RBFnn(sf2, l, X)
KmmInv = sT.matrix_inverse(Kmm)
A = KmmInv.dot(Kmn)                             # sparse-GP conditional mean weights
B = Knn - T.sum(Kmn * KmmInv.dot(Kmn), 0)       # conditional variances
F = A.T.dot(U) + B[:, None]**0.5 * randn(N, K)  # sample GP outputs f_ndk
S = T.nnet.softmax(F)
KL_U, KL_X = get_KL_U(), get_KL_X()
LS = T.sum(T.log(T.sum(Y * S, 1))) - KL_U - KL_X  # stochastic lower bound
LS_func = theano.function(['''inputs'''], LS)
dLS_dm = theano.function(['''inputs'''], T.grad(LS, m))  # and others
# ... and run RMS-PROP
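The last comment elides the optimisation loop. Below is a self-contained sketch of an RMSProp-style ascent step (an illustration only; a toy quadratic objective stands in for LS, and the per-parameter adaptive denominator is what removes most learning-rate tuning):

import numpy as np

# Toy objective f(m) = -||m||^2, maximised with noisy gradients
# standing in for the Monte Carlo gradients of LS above.
rng = np.random.default_rng(0)
m = rng.standard_normal(5)
acc = np.zeros_like(m)                    # running mean of squared gradients
lr, decay, eps = 1e-2, 0.9, 1e-6
for step in range(2000):
    g = -2 * m + 0.1 * rng.standard_normal(5)  # noisy ascent direction
    acc = decay * acc + (1 - decay) * g**2
    m += lr * g / np.sqrt(acc + eps)      # RMSProp: per-parameter step size
print(np.round(m, 2))                     # hovers near the maximiser at 0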


Back to Breast Cancer...

Model | Perplexity
Uniform | 8.68
(Discrete) Bigram Dirichlet–Multinomial | 3.41
(Linear) Latent Gaussian Model | 3.57 ± 0.208
Categorical Latent Gaussian Process | 2.86 ± 0.119

Model perplexity, predicting randomly missing test results

(Lots more results in the poster!)
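For reference (an illustration, not from the slides): perplexity here is the exponentiated average negative log-probability the model assigns to the true values of the held-out entries, so lower is better. A minimal sketch:

import numpy as np

def perplexity(p_true):
    # p_true: model probabilities assigned to the true values of the
    # held-out test results; lower perplexity is better.
    return np.exp(-np.mean(np.log(p_true)))

print(perplexity(np.full(100, 0.1)))  # uniform over 10 values -> 10.0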


Thank you

Latent Gaussian Processes for Distribution Estimation of Multivariate Categorical Data

Where the Categorical Latent Gaussian Process sits among existing models:

Output | Linear | Non-linear
Observed input, continuous | Linear regression | Gaussian process regression
Observed input, discrete | Logistic regression | Gaussian process classification
Latent input, continuous | Factor analysis | Gaussian process latent variable model
Latent input, discrete | Latent Gaussian models | Categorical Latent Gaussian Process

Please come and see our poster
