40
Problem Setup Density Estimation using Real NVP Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017

Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

  • Upload
    others

  • View
    6

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Deep Generative Models: Beyond GANs

Raunak Kumar

University of British Columbia

MLRG, 2017 Winter Term 1

November 28, 2017

Page 2: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Overview

1 Problem Setup

2 Density Estimation using Real NVPOutlineChange of VariableCoupling LayersExperimentsSummary

Page 3: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Problem Setup

Page 4: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Density Estimation

Goal: Construct a generative probabilistic model.

Data is high dimensional and highly structured.

Challenge: Build complex yet trainable models.

We’ve seen VAEs and GANs over the past month.

Today we will talk about Density Estimation using Real NVP [1].

If we have time, I will briefly go over Pixel Recurrent NeuralNetworks [2].

Page 5: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Density Estimation

Goal: Construct a generative probabilistic model.

Data is high dimensional and highly structured.

Challenge: Build complex yet trainable models.

We’ve seen VAEs and GANs over the past month.

Today we will talk about Density Estimation using Real NVP [1].

If we have time, I will briefly go over Pixel Recurrent NeuralNetworks [2].

Page 6: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Density Estimation

Goal: Construct a generative probabilistic model.

Data is high dimensional and highly structured.

Challenge: Build complex yet trainable models.

We’ve seen VAEs and GANs over the past month.

Today we will talk about Density Estimation using Real NVP [1].

If we have time, I will briefly go over Pixel Recurrent NeuralNetworks [2].

Page 7: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Density Estimation using Real NVP

Page 8: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Outline

Gameplan

1 Want to use maximum likelihood estimation.

2 Write the likelihood of data in a form that is easy to compute.

3 Compute using deep neural networks.

Page 9: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Likelihood

Goal: Learn PX (x) using maximum likelihood estimation.

But how do we compute likelihood of the data?

Can we write PX (x) in a way that is easy to compute?

→ Use change of variable formula!

Page 10: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Likelihood

Goal: Learn PX (x) using maximum likelihood estimation.

But how do we compute likelihood of the data?

Can we write PX (x) in a way that is easy to compute?

→ Use change of variable formula!

Page 11: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Likelihood

Goal: Learn PX (x) using maximum likelihood estimation.

But how do we compute likelihood of the data?

Can we write PX (x) in a way that is easy to compute?

→ Use change of variable formula!

Page 12: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Change of Variable

Given:

observed data variable x ∈ X ,

latent variable z ∈ Z with prior probability distribution PZ ,

a bijection f : X → Z with its inverse denoted as g ,

the change of variable formula defines a distribution on x as

PX (x) = PZ (f (x))

∣∣∣∣det

(∂f (x)

∂xT

)∣∣∣∣ . (1)

Proof sketch for 1D case:

1 Express the CDF of X in terms of an integral over Z using thebijection f .

2 Compute the density of X using the Fundamental Theorem ofCalculus and Chain Rule.

Page 13: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Change of Variable

Given:

observed data variable x ∈ X ,

latent variable z ∈ Z with prior probability distribution PZ ,

a bijection f : X → Z with its inverse denoted as g ,

the change of variable formula defines a distribution on x as

PX (x) = PZ (f (x))

∣∣∣∣det

(∂f (x)

∂xT

)∣∣∣∣ . (1)

Proof sketch for 1D case:

1 Express the CDF of X in terms of an integral over Z using thebijection f .

2 Compute the density of X using the Fundamental Theorem ofCalculus and Chain Rule.

Page 14: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Change of Variable

Given:

observed data variable x ∈ X ,

latent variable z ∈ Z with prior probability distribution PZ ,

a bijection f : X → Z with its inverse denoted as g ,

the change of variable formula defines a distribution on x as

PX (x) = PZ (f (x))

∣∣∣∣det

(∂f (x)

∂xT

)∣∣∣∣ . (1)

Proof sketch for 1D case:

1 Express the CDF of X in terms of an integral over Z using thebijection f .

2 Compute the density of X using the Fundamental Theorem ofCalculus and Chain Rule.

Page 15: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Good News

Sampling

Draw z ∼ PZ , and generate a sample x = f −1(z) = g(z).

Inference

PX (x) = product of PZ (f (x)) and its Jacobian determinant.

Page 16: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

Bad News

Data and latent spaces are both very high-dimensional. So:

Jacobian matrix is huge!

Computing the Jacobian is expensive.

Computing its determinant is expensive.

Page 17: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

But . . .

Observe:

Only need the determinant of the Jacobian, not the Jacobian itself.

Determinant of a triangular matrix is the product of its diagonalterms.

→ Design bijective f such that its Jacobian is triangular.→ Use change of variable formula.→ Train using maximum likelihood estimation.→ Once we learn f , we can do sampling and inference.

How do we design such an f ???Why no deep neural networks yet???

Page 18: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

But . . .

Observe:

Only need the determinant of the Jacobian, not the Jacobian itself.

Determinant of a triangular matrix is the product of its diagonalterms.→ Design bijective f such that its Jacobian is triangular.→ Use change of variable formula.→ Train using maximum likelihood estimation.→ Once we learn f , we can do sampling and inference.

How do we design such an f ???Why no deep neural networks yet???

Page 19: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

But . . .

Observe:

Only need the determinant of the Jacobian, not the Jacobian itself.

Determinant of a triangular matrix is the product of its diagonalterms.→ Design bijective f such that its Jacobian is triangular.→ Use change of variable formula.→ Train using maximum likelihood estimation.→ Once we learn f , we can do sampling and inference.

How do we design such an f ???

Why no deep neural networks yet???

Page 20: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Change of Variable

But . . .

Observe:

Only need the determinant of the Jacobian, not the Jacobian itself.

Determinant of a triangular matrix is the product of its diagonalterms.→ Design bijective f such that its Jacobian is triangular.→ Use change of variable formula.→ Train using maximum likelihood estimation.→ Once we learn f , we can do sampling and inference.

How do we design such an f ???Why no deep neural networks yet???

Page 21: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Designing f

Composition of bijections is a bijection.

→ Build f by composing a sequence of simple bijections.

Each such simple bijection is called an affine coupling layer.

Page 22: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Designing f

Composition of bijections is a bijection.→ Build f by composing a sequence of simple bijections.

Each such simple bijection is called an affine coupling layer.

Page 23: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Coupling Layer

Given x ∈ RD and d < D, the output of an affine coupling layer, y , is:

y1:d = x1:d , (2)

yd+1:D = xd+1:D ◦ exp (s(x1:d)) + t(x1:d), (3)

where s, t : Rd → RD−d are arbitrary functions.(arbitrary means deep convolutional neural networks . . . )

Since we are interested in sampling and inference, what about the inverseand the Jacobian determinant?

Page 24: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Coupling Layer

Given x ∈ RD and d < D, the output of an affine coupling layer, y , is:

y1:d = x1:d , (2)

yd+1:D = xd+1:D ◦ exp (s(x1:d)) + t(x1:d), (3)

where s, t : Rd → RD−d are arbitrary functions.(arbitrary means deep convolutional neural networks . . . )

Since we are interested in sampling and inference, what about the inverseand the Jacobian determinant?

Page 25: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Inverse

f :

y1:d = x1:d ,

yd+1:D = xd+1:D ◦ exp (s(x1:d)) + t(x1:d),

g :

x1:d = y1:d , (4)

xd+1:D = (yd+1:D − t(y1:d)) ◦ exp (−s(y1:d)) (5)

Sampling is as efficient as inference.

No need to invert s or t.

Page 26: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Inverse

Page 27: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Jacobian

The Jacobian has the following form:

∂y

∂xT=

[Id 0

∂yd+1:D

∂xT1:d

diag(exp s(x1:d))

]. (6)

Determinant: exp

(∑j

s(x1:d)j

). Easy to compute.

No need to compute Jacobian of s or t.

Page 28: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Masked Convolution

Using a binary mask b, the output of the coupling layer can be written as

y = b ◦ x + (1− b) ◦ (x ◦ exp (s(b ◦ x)) + t(b ◦ x)). (7)

We will see examples of masks shortly.

Page 29: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Combining Coupling Layers

A coupling layer leaves some components of the input unchanged.

Page 30: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Combining Coupling Layers

A coupling layer leaves some components of the input unchanged.

→ Compose such layers in an alternating pattern.

Page 31: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Combining Coupling Layers

A coupling layer leaves some components of the input unchanged.

→ Compose such layers in an alternating pattern.

Page 32: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Combining Coupling Layers

Inverse and Jacobian determinant still ok to compute because of thefollowing:

(fb ◦ fa)−1 = f −1a ◦ f −1

b , (8)

∂(fb ◦ fa)

∂xTa(xa) =

∂faxTa

(xa)∂fbxTb

(xb = fa(xa)), (9)

det(AB) = det(A)det(B). (10)

Page 33: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Multi-Scale Architecture

Implement a multi-scale architecture using squeezing.→ (s, s, c)→ ( s

2 ,s2 , 4c).

→ Trade spatial size for number of channels.

At each scale:→ Apply 3 coupling layers with alternating checkerboard masks.→ Squeeze.→ Apply 3 coupling layers with alternating channel-wise masks.→ Only apply 4 coupling layers with alternating checkerboard

masks in the final scale.

Page 34: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Multi-Scale Architecture

Implement a multi-scale architecture using squeezing.→ (s, s, c)→ ( s

2 ,s2 , 4c).

→ Trade spatial size for number of channels.

At each scale:→ Apply 3 coupling layers with alternating checkerboard masks.→ Squeeze.→ Apply 3 coupling layers with alternating channel-wise masks.→ Only apply 4 coupling layers with alternating checkerboard

masks in the final scale.

Page 35: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Multi-Scale Architecture

It is a pain to propagate all D dimensions across all layers.→ Memory.→ Computational cost.→ Number of trainable parameters.

Page 36: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Coupling Layers

Multi-Scale Architecture

Factor out half the dimensions at regular intervals.

Gaussianize such units, concatenate to obtain final output.→ Distributes the loss throughout the network.→ Learns local, fine-grained features.

Page 37: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Experiments

Results and Samples

(Show paper.)

Page 38: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

Summary

Summary

Use change of variable to express likelihood of data.

Use coupling layers to define bijection between data and latentspaces.

Train using maximum likelihood.

Can perform exact and efficient

inference,sampling,

Page 39: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

References

Dinh, Laurent, Jascha Sohl-Dickstein, and Samy Bengio.

”Density estimation using Real NVP.”

arXiv preprint arXiv:1605.08803 (2016).

Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu.

”Pixel recurrent neural networks.”

arXiv preprint arXiv:1601.06759 (2016).

Page 40: Deep Generative Models: Beyond GANs · Deep Generative Models: Beyond GANs Raunak Kumar University of British Columbia MLRG, 2017 Winter Term 1 November 28, 2017. Problem Setup Density

Problem Setup Density Estimation using Real NVP

The End