Deep Generative Models with Stick-Breaking Priors Eric Nalisnick University of California, Irvine


Outline

1 Overview of Deep Generative Models: Definition, Applications, Training
2 The Dirichlet Process: Definition, Stochastic Gradient Variational Bayes for the DP
3 Deep Generative Models with Stick-Breaking Priors: Stick-Breaking Variational Autoencoders, Dirichlet Process Variational Autoencoders, Stick-Breaking for Adaptive Computation

Overview of Deep Generative Models


“Generative modeling…denotes modeling tasks in which a density over all the observable quantities is constructed.”

David MacKay, Information Theory, Inference, and Learning Algorithms

p(y | x), p(x, y), p(x)

Deep Generative Models

Density Network (MacKay & Gibbs, 1997)

[Diagram: Stochastic Latent Variable Z → One or More Neural Network Layers → Observed Data X]

[Plate diagram labels: Z (plate d), X (plate N), parameters θ and λ]

Sampling from a Deep Generative Model

[Diagram: sampled latent variable Z̃ → generated data X̂]

[Figure: ImageNet images next to Samples from a DGM]

[Figure: Samples from a DGM (Bowman et al., 2016)]

Overview of Deep Generative Models

1.1 Why Generative Modeling?

Why Generative Modeling?

1 Cognitive Motivations.
2 Semi-Supervised Learning: Regularizing discriminative models with unlabeled data. [Figure annotations: p(y | x), then p(x, y)]
3 Simulating Environments for Reinforcement Learning.

Overview of Deep Generative Models

1.2 How to Train a Deep Generative Model

Learning a Deep Generative Model

1 Importance Sampling
2 Variational Inference
3 Adversarial Training

[Arrow label: Sophistication]

Learning a Deep Generative Model

1 Importance Sampling

Works only in low dimensions

(MacKay & Gibbs, 1997)
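As a rough illustration of an importance-sampled estimate of the marginal likelihood, the sketch below uses a one-dimensional linear-Gaussian toy model, chosen here only because its exact marginal is known. The model, the function names, and the use of the prior as the proposal are all assumptions for illustration and are not taken from the talk.

```python
import math
import random


def log_normal_pdf(x, mean, std):
    """Log density of a univariate Gaussian N(mean, std^2) evaluated at x."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - 0.5 * ((x - mean) / std) ** 2


def importance_sampled_log_marginal(x, weight, noise_std, num_samples, rng):
    """Estimate log p(x) for the toy model z ~ N(0, 1), x | z ~ N(weight * z, noise_std^2).

    The prior is used as the proposal, so each importance weight reduces to p(x | z).
    """
    log_weights = [
        log_normal_pdf(x, weight * rng.gauss(0.0, 1.0), noise_std)
        for _ in range(num_samples)
    ]
    # Log-mean-exp for numerical stability.
    max_log_w = max(log_weights)
    mean_w = sum(math.exp(lw - max_log_w) for lw in log_weights) / num_samples
    return max_log_w + math.log(mean_w)


rng = random.Random(0)
estimate = importance_sampled_log_marginal(1.0, 2.0, 0.5, 200_000, rng)
# For this linear-Gaussian toy model the exact marginal is N(0, weight^2 + noise_std^2).
exact = log_normal_pdf(1.0, 0.0, math.sqrt(2.0 ** 2 + 0.5 ** 2))
```

With a single latent dimension the estimate lands close to the exact value. As the latent dimension grows, fewer prior samples fall where p(x | z) is large, which is consistent with the slide's remark that the approach works only in low dimensions.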

Learning a Deep Generative Model

2 Variational Inference

Posterior Approximation

[Equation labels: Data Reconstruction; Regularization]

Problem: For neural network likelihoods, these integrals are analytically intractable
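The slide's equation did not survive extraction. The integrals it refers to are presumably those of the evidence lower bound, whose standard form is given here for reference. The notation (q_φ for the posterior approximation, p_θ for the generative model) is assumed and not taken from the slides. The first term corresponds to the "Data Reconstruction" label and the second to the "Regularization" label.

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
\;-\; \mathrm{KL}\big(q_\phi(z \mid x)\,\|\,p(z)\big)
```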


Learning a Deep Generative Model

Step #1: Use Monte Carlo Approximations

Problem: Still can’t compute gradient w.r.t. variational parameters

Step #2: Sample via Differentiable Non-Centered Parametrizations

2 Variational Inference
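A minimal sketch of the two steps, assuming a Gaussian variational distribution and a toy integrand f(z) = z^2, both chosen here for illustration only. The sample is written as z = mu + sigma * eps, so the Monte Carlo estimate becomes a differentiable function of the variational parameters.

```python
import random


def pathwise_gradients(mu, sigma, num_samples, rng):
    """Monte Carlo gradient of E[z^2], z ~ N(mu, sigma^2), w.r.t. (mu, sigma).

    z is written as mu + sigma * eps with eps ~ N(0, 1), so each sample is a
    differentiable function of the variational parameters.
    """
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps
        df_dz = 2.0 * z            # derivative of f(z) = z^2
        grad_mu += df_dz * 1.0     # dz/dmu = 1
        grad_sigma += df_dz * eps  # dz/dsigma = eps
    return grad_mu / num_samples, grad_sigma / num_samples


rng = random.Random(0)
grad_mu, grad_sigma = pathwise_gradients(mu=1.5, sigma=0.7, num_samples=100_000, rng=rng)
# Analytically E[z^2] = mu^2 + sigma^2, so the exact gradients are (2 * mu, 2 * sigma).
```

In a full model the derivative of f would come from automatic differentiation through the decoder. The hand-written chain rule here only makes the mechanism visible.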

The Variational Autoencoder (VAE)

(Kingma & Welling, 2014), (Rezende et al., 2014)

Learning a Deep Generative Model

3 Adversarial Training (Goodfellow et al., 2014)

Classifier predicting if x is generated or observed data

Two step training:
1 Minimize with respect to [symbol lost in extraction]
2 Maximize with respect to [symbol lost in extraction]

Generative Adversarial Networks (GANs) (Goodfellow et al., 2014)


Variational Bayes vs Adversarial Training

[Figure: VAE and GAN shown side by side]

Deep Generative Models Summary

[Diagram: Z → X]

(Original) Density Network. Training Method: Importance Sampled Estimate of Marginal Likelihood. No auxiliary model.
Variational Autoencoder. Training Method: Stochastic Gradient Variational Bayes. Backend auxiliary model.
Generative Adversarial Network. Training Method: Adversarial Training. Frontend auxiliary model.

The Dirichlet Process

[Figure: slide build with items numbered 1 and 2; equations not recovered]

Draw from GEM distribution
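A draw from a GEM distribution can be sketched with the usual stick-breaking construction, in which fractions v_k ~ Beta(1, alpha) are broken off the remaining stick one at a time. The function name, the finite truncation, and the choice to give the leftover stick to the last weight are assumptions made so the example terminates. The values alpha = 5 and truncation = 50 are illustrative.

```python
import random


def truncated_gem_sample(alpha, truncation, rng):
    """Draw mixture weights from a GEM(alpha) stick-breaking process, truncated.

    Each fraction v_k ~ Beta(1, alpha) breaks off a piece of the remaining stick:
    pi_k = v_k * prod_{j<k} (1 - v_j). The last weight absorbs the leftover stick
    so that the truncated weights sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


rng = random.Random(0)
weights = truncated_gem_sample(alpha=5.0, truncation=50, rng=rng)
```

Larger values of alpha break off smaller fractions on average, so the mass is spread over more components.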


SGVB for The Dirichlet Process

Two Obstacles:
1. Gradients through πk
2. Gradients through samples from the mixture

Differentiating Through πk

Obstacle: The Beta distribution does not have a non-centered parametrization (except in special cases)


Kumaraswamy Distribution: A Beta-like distribution with a closed-form inverse CDF. Use as variational posterior.

Poondi Kumaraswamy (1930-1988)
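The closed-form inverse CDF is what makes the Kumaraswamy distribution convenient here: a uniform draw u can be pushed through a differentiable function of the parameters (a, b). The sketch below uses the standard CDF F(x) = 1 - (1 - x^a)^b. The function names and parameter values are illustrative assumptions.

```python
import random


def kumaraswamy_cdf(x, a, b):
    """CDF of Kumaraswamy(a, b) on (0, 1): F(x) = 1 - (1 - x^a)^b."""
    return 1.0 - (1.0 - x ** a) ** b


def kumaraswamy_sample(u, a, b):
    """Inverse-CDF sample: maps u ~ Uniform(0, 1) to x ~ Kumaraswamy(a, b)."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


rng = random.Random(0)
a, b = 1.0, 5.0
uniforms = [rng.random() for _ in range(1000)]
samples = [kumaraswamy_sample(u, a, b) for u in uniforms]
```

Because the randomness enters only through u, gradients with respect to a and b can be taken through `kumaraswamy_sample` directly, in the same way as the Gaussian case z = mu + sigma * eps.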

Stochastic Backpropagation through Mixtures

Obstacle: Not obvious how to use the reparametrization trick for samples from a mixture distribution.


Brute Force Solution: Summing over all components results in a tractable ELBO but requires O(T) decoder propagations.
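A sketch of the brute-force idea, under the assumption that the mixture components are Gaussian and with a hypothetical stand-in for the decoder. Instead of sampling a component index, the expectation is written as a weighted sum over all T components, so the mixture weights enter as plain multipliers and each component needs its own decoder pass.

```python
import math
import random


def decoder_log_lik(x, z):
    """Hypothetical stand-in for a decoder pass: log N(x; z, 1)."""
    return -0.5 * math.log(2.0 * math.pi) - 0.5 * (x - z) ** 2


def brute_force_expected_log_lik(x, weights, means, stds, rng):
    """One-sample estimate of sum_k pi_k * E_{q_k}[log p(x | z)].

    Returns the estimate and the number of decoder passes used.
    """
    total = 0.0
    decoder_calls = 0
    for pi_k, mu_k, std_k in zip(weights, means, stds):
        z_k = mu_k + std_k * rng.gauss(0.0, 1.0)  # reparametrized component sample
        total += pi_k * decoder_log_lik(x, z_k)   # weight enters as a plain multiplier
        decoder_calls += 1
    return total, decoder_calls


rng = random.Random(0)
estimate, calls = brute_force_expected_log_lik(
    x=0.3, weights=[0.6, 0.3, 0.1], means=[0.0, 1.0, -1.0], stds=[0.5, 0.5, 0.5], rng=rng
)
```

The call count equals the number of components, which is the O(T) cost the slide mentions.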


Deep Generative Models with Stick-Breaking Priors

Why Combine the DP with DGMs?

[Diagram: latent variable Z, drawn from a stochastic process → X]

1 Dynamic Latent Representations (Width)
2 Neural Networks with Adaptive Computation (Depth)
3 Automatic and Principled Model Selection (Based on Marginal Likelihood)

Deep Generative Models with Stick-Breaking Priors

3.1 The Stick-Breaking Variational Autoencoder

In collaboration with Padhraic Smyth

MODEL #1: Stick-Breaking Variational Autoencoder

Kumaraswamy Samples

Truncated posterior; not necessary but learns faster

(Nalisnick & Smyth, 2017)

Samples from Generative Model (Nalisnick & Smyth, 2017)

Stick-Breaking VAE: Truncation level of 50 dimensions, Beta(1,5) Prior
Gaussian VAE: 50 dimensions, N(0,1) Prior

Quantitative Results for SB-VAE (Nalisnick & Smyth, 2017)

[Figure rows: Unsupervised; Semi-Supervised]
[Panels: MNIST: Dirichlet Latent Space (t-SNE); MNIST: Gaussian Latent Space (t-SNE); MNIST: kNN Classifier on Latent Space; (Estimated) Marginal Likelihoods]

Nonparametric version of (Kingma et al., NIPS 2014)’s M2 model

Deep Generative Models with Stick-Breaking Priors

3.2 The Dirichlet Process Variational Autoencoder

In collaboration with Lars Hertel and Padhraic Smyth

MODEL #2: Dirichlet Process Variational Autoencoder (Nalisnick et al., 2016)

Samples from Generative Model (Nalisnick et al., 2016)

[Figure panels: MNIST Samples from Component #1; MNIST Samples from Component #5]

Quantitative Results for DP-VAE (Nalisnick et al., 2016)

[Panels: MNIST: Dirichlet Process Latent Space (t-SNE); MNIST: Gaussian Latent Space (t-SNE); MNIST: kNN Classifier on Latent Space; (Estimated) Marginal Likelihoods]

Deep Generative Models with Stick-Breaking Priors

3.3 Adaptive Computation Time via Stick-Breaking

In collaboration with Marc-Alexandre Cote and Hugo Larochelle

MODEL #3: Stick-Breaking Adaptive Computation

[Diagram: input X; outputs y1, y2, ..., yk; stick-breaking weights π1, π2, ..., πk]

Remaining stick is negligibly small
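One plausible reading of this stopping rule, sketched under stated assumptions: each computation step breaks off a fraction of the remaining stick, and computation halts once what is left falls below a small threshold. The function name, the threshold `eps`, and the fixed fractions are hypothetical. How the model produces the fractions and combines the outputs y1, ..., yk is not recoverable from the slide and is not modeled here.

```python
def adaptive_stick_weights(fractions, eps=1e-2):
    """Accumulate stick-breaking weights until the remaining stick is below eps.

    fractions: iterable of break proportions v_k in (0, 1), one per computation step.
    Returns the weights pi_1..pi_k actually used and the leftover stick length.
    """
    weights = []
    remaining = 1.0
    for v in fractions:
        weights.append(v * remaining)  # pi_k = v_k * (stick left before step k)
        remaining *= 1.0 - v
        if remaining < eps:            # remaining stick is negligibly small: halt
            break
    return weights, remaining


# With every fraction equal to 0.5 the stick halves at each step.
weights, remaining = adaptive_stick_weights([0.5] * 20, eps=1e-2)
```

Under this reading, the number of steps adapts to the fractions: large early fractions exhaust the stick quickly, while small ones keep the computation going for longer.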


Quantitative Results for SB-RNN: Image Modeling


Quantitative Results for SB-RNN: Text Modeling


Conclusions

1 Superior Latent Spaces with NP priors: Seem to preserve class structure well, resulting in better discriminative properties. Their dynamic capacity naturally encodes factors of variation.
2 Adaptive Computation: Fusing NNs with nonparametric priors endows NNs with natural model selection properties.

Thank you. Questions?


Appendix
