
Exploring generative models with Python
Javier Jorge Cano

PyCon 2018

Problem definition

• Several successful applications of machine learning techniques rely on the large amounts of data currently available.

• But sometimes data is scarce, e.g., when it is difficult to collect (as in medical imaging).

• We can obtain more data by performing controlled distortions that do not modify the true nature of the sample; this is called data augmentation.
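As a sketch of what such hand-crafted distortions can look like in Python (using scipy.ndimage, which is an assumption here; the poster only lists NumPy, TensorFlow and Matplotlib):

```python
import numpy as np
from scipy import ndimage

def augment(image, rng=np.random.default_rng()):
    """Return a randomly rotated and shifted copy of a 2D grayscale image."""
    angle = rng.uniform(-15, 15)                       # small rotation, in degrees
    shift = rng.uniform(-2, 2, size=2)                 # small translation, in pixels
    rotated = ndimage.rotate(image, angle, reshape=False)  # keep the original shape
    return ndimage.shift(rotated, shift)
```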

Figure 1: Left to right: original image and different alterations.

Figure 2: Another example of simulating different positions and rotations with different transformations.

• These transformations are hand-crafted and problem-dependent. Could we provide a domain-agnostic approach to do this?

Generative models

• Neural Networks are good at classifying, meaning that they learn a “mapping” between the input and the output.

Figure 3: Example of DNN-based classifier.

• We can force this map to mirror the input, so we end up with a model that can reconstruct the sample.

Figure 4: Example of DNN-based reconstruction model.

• These models usually encode the input into some manifold, and then decode it to recover the original sample.

• This kind of model is known as a Generative model.
• Could we use this model’s “inner representation” to create new outputs?

• Could these artificial samples be useful (e.g., for training classifiers)?

• Two techniques dominate recent approaches to these problems: one line of research is based on Generative Adversarial Networks (GANs) and the other on Variational Autoencoders (VAEs).

• GAN: A mix of Game Theory and Machine Learning, in which two networks (Generator and Discriminator) compete in a game where one tries to fool the other.

[Diagram: the Generator G maps a latent vector z_t to a fake sample x̂_t; the Discriminator D receives real samples x_t or fakes x̂_t and outputs T/F.]

Figure 5: Discriminator learns to decide if a sample is true or fake, while Generator aims to fool the discriminator.
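To make the game concrete, here is a hedged sketch of one adversarial training step in TensorFlow 2 style (the `generator` and `discriminator` models, the optimizers and the latent size are all assumptions; the poster does not show its GAN code):

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)

@tf.function
def train_step(real_batch, generator, discriminator, g_opt, d_opt, latent_dim=100):
    z = tf.random.normal([tf.shape(real_batch)[0], latent_dim])
    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake = generator(z, training=True)
        real_logits = discriminator(real_batch, training=True)
        fake_logits = discriminator(fake, training=True)
        # D: classify real samples as true (1) and generated samples as fake (0)
        d_loss = (bce(tf.ones_like(real_logits), real_logits)
                  + bce(tf.zeros_like(fake_logits), fake_logits))
        # G: fool D into labelling generated samples as true
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)
    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
```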

• VAE: Based on the encoding-decoding principle, these models aim to minimize the reconstruction error (e.g., mean squared error) while constraining the inner representation to be similar to a simple distribution (e.g., a Gaussian).

[Diagram: the encoder maps x_t to µ_t and σ_t; the latent vector is sampled via the reparameterization z_t = µ_t + σ_t · ϵ, with ϵ ∼ N(0, 1), and decoded into x̂_t.]

Figure 6: VAE architecture: the input x_t is projected into the manifold, obtaining the vector z_t. The decoder recovers this vector, yielding the reconstructed input x̂_t.
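A minimal VAE sketch in TensorFlow/Keras, assuming flattened 28×28 inputs, a 2-D latent space and MSE reconstruction; this is an illustration, not the poster’s actual code:

```python
import tensorflow as tf

class VAE(tf.keras.Model):
    def __init__(self, input_dim=784, latent_dim=2):
        super().__init__()
        self.enc = tf.keras.layers.Dense(256, activation="relu")
        self.mu = tf.keras.layers.Dense(latent_dim)
        self.log_var = tf.keras.layers.Dense(latent_dim)
        self.dec_hidden = tf.keras.layers.Dense(256, activation="relu")
        self.dec_out = tf.keras.layers.Dense(input_dim, activation="sigmoid")

    def encode(self, x):
        h = self.enc(x)
        return self.mu(h), self.log_var(h)

    def reparameterize(self, mu, log_var):
        eps = tf.random.normal(tf.shape(mu))        # eps ~ N(0, 1)
        return mu + tf.exp(0.5 * log_var) * eps     # z_t = mu_t + sigma_t * eps

    def decode(self, z):
        return self.dec_out(self.dec_hidden(z))

    def call(self, x):
        mu, log_var = self.encode(x)
        z = self.reparameterize(mu, log_var)
        x_hat = self.decode(z)
        # MSE reconstruction plus the analytic KL(q(z|x) || N(0, I)) constraint
        rec = tf.reduce_mean(tf.reduce_sum(tf.square(x - x_hat), axis=1))
        kl = -0.5 * tf.reduce_mean(
            tf.reduce_sum(1 + log_var - tf.square(mu) - tf.exp(log_var), axis=1))
        self.add_loss(rec + kl)
        return x_hat
```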

Figure 7: Left: 2D latent space provided by the model, trained on the MNIST dataset. Right: sampling the 2D manifold and decoding the images.
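The right panel of Figure 7 can be reproduced with Matplotlib by decoding a regular grid of latent points (a sketch, assuming the VAE class above and 28×28 outputs):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_manifold(vae, n=15, lim=3.0, side=28):
    """Decode an n-by-n grid of 2D latent points and tile the images."""
    grid = np.zeros((n * side, n * side))
    zs = np.linspace(-lim, lim, n)
    for i, z1 in enumerate(zs):
        for j, z2 in enumerate(zs):
            img = vae.decode(np.array([[z1, z2]], dtype="float32"))
            grid[i * side:(i + 1) * side, j * side:(j + 1) * side] = \
                np.asarray(img).reshape(side, side)
    plt.imshow(grid, cmap="gray")
    plt.axis("off")
    plt.show()
```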

• We are going to use VAEs, as they provide the manifold directly. With this manifold we can perform different modifications of the original data.

VAE for data augmentation

1 Train a VAE on your training set, using a validation set to check the model’s evolution.

2 Project your training set using the model, obtaining the projected version z_t of each sample.

3 Alter these z_t vectors and then reconstruct them (sketched in code after Figure 8).

[Diagram: the input is encoded to z_t in the latent space (axes z_1, z_2), shifted to z_{t+1}, and decoded.]

Figure 8: Encoding the input, modifying it and then reconstructing the altered sample with the decoder.
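The three steps above, as a hedged sketch reusing the VAE class from before (`alter` is any latent-space perturbation, e.g., one of the functions sketched after Figure 9):

```python
import numpy as np

def augment_dataset(vae, x_train, alter):
    """Encode the training set (step 2), alter z_t (step 3) and decode it back."""
    mu, log_var = vae.encode(x_train)        # project onto the manifold
    z = vae.reparameterize(mu, log_var)
    z_altered = alter(np.asarray(z))         # modify in latent space
    return np.asarray(vae.decode(z_altered.astype("float32")))
```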

Different methods to alter the projected samples (sketched in code below):

Initial projection

Applying noise

Performing interpolation

Performing extrapolation
Figure 9: Transformations on a 2D example manifold.
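Hedged sketches of the three alterations, following the formulation evaluated in [1] (the exact λ values and how the same-class neighbour z_j is chosen are assumptions):

```python
import numpy as np

def add_noise(z, scale=0.1, rng=np.random.default_rng()):
    """Perturb a projected sample with Gaussian noise."""
    return z + rng.normal(0.0, scale, size=z.shape)

def interpolate(z_i, z_j, lam=0.5):
    """Move z_i towards a neighbour z_j in latent space."""
    return z_i + lam * (z_j - z_i)

def extrapolate(z_i, z_j, lam=0.5):
    """Push z_i away from a neighbour z_j in latent space."""
    return z_i + lam * (z_i - z_j)
```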

[Figure rows: with extrapolation, with interpolation, adding noise, real instances.]

Figure 10: Original input and the different outputs depending on the modification; each column corresponds to the same digit.

Figure 11: Smooth transformations that capture changes in scale and rotation, and even changes between classes.

Use case: UJIPEN Dataset

Labels   Dim.             Tr.     Val.   Ts.
97       4.9k (70x70 px)  5,820   582    4,656

Table 1: Dataset distribution.

Figure 12: Some samples from UJIPEN.

                                     Generation methods
Classifier                Baseline   Noise   Interpolation   Extrapolation
Support vector machines   54.79      59.09   63.48           63.19
K-nearest neighbors       31.86      39.29   53.15           46.74
Neural Network            63.99      57.20   65.60           64.09

Table 2: UJIPEN results, accuracy on the test set (%) [1].

Figure 13: Top row: samples after interpolating. Bottom row: samples after extrapolating.

Figure 14: Linear interpolation between samples from the same class.

Conclusions

• We used a generative model, a VAE in particular, to create synthetic samples.

• We performed different alterations in the inner representation that improved the accuracy of the selected classifiers.

• As future work, we want to apply this model to different kinds of data: text, speech, feature vectors, etc.

Python For Science: All the experiments were performed with:
• NumPy.
• TensorFlow.
• Matplotlib.

Reference

[1] J. Jorge, J. Vieco, R. Paredes, J. A. Sanchez, and J. M. Benedí. Empirical evaluation of variational autoencoders for data augmentation. In Proceedings of the 13th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 5: VISAPP, pages 96–104. INSTICC, SciTePress, 2018.