Bayesian Methods in Machine Learning

Dmitry Vetrov
Research professor at the Faculty of Computer Science, HSE
Head of the Bayesian Methods Research Group
http://bayesgroup.ru

Outline

• Intro to mathematics of big data

• Bayesian framework

• Latent variable models

• Deep Bayes

What is machine learning?

Simple example

Areas of application

Milestones

Entering the Age of Big Data

[Figure labels: data, computational speed]

First Steps towards Mathematics of Big Data

Deep learning: why now?

Conditional and marginal distributions
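
As a reminder of the standard definitions behind this slide title, the marginal follows from the sum rule and the conditional from the product rule. The symbols $x$ and $y$ are assumed here for illustration and are not taken from the slides.

```latex
\begin{align}
p(x) &= \int p(x, y)\, dy, \\
p(x \mid y) &= \frac{p(x, y)}{p(y)}, \qquad p(y) > 0, \\
p(y \mid x) &= \frac{p(x \mid y)\, p(y)}{\int p(x \mid y')\, p(y')\, dy'} .
\end{align}
```

The last line is Bayes' theorem, obtained by applying the product rule in both directions and the sum rule in the denominator.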

Bayesian framework

Frequentist vs. Bayesian frameworks

Bayesian inference
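
A minimal sketch of Bayesian inference in a conjugate model may help make the idea concrete. The Beta-Bernoulli setup, the function name `beta_bernoulli_posterior`, and the data below are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def beta_bernoulli_posterior(flips, a=1.0, b=1.0):
    """Return the Beta posterior parameters after observing 0/1 outcomes.

    Prior: theta ~ Beta(a, b); likelihood: each outcome ~ Bernoulli(theta).
    Conjugacy gives the posterior Beta(a + #ones, b + #zeros).
    """
    flips = np.asarray(flips)
    ones = int(flips.sum())
    zeros = int(flips.size - ones)
    return a + ones, b + zeros

flips = [1, 1, 0, 1, 0, 1, 1, 1]              # hypothetical data: 6 ones, 2 zeros
a_post, b_post = beta_bernoulli_posterior(flips)
posterior_mean = a_post / (a_post + b_post)   # (1 + 6) / (2 + 8)
mle = float(np.mean(flips))                   # 6 / 8, the frequentist point estimate
print(a_post, b_post, posterior_mean, mle)
```

The posterior is a full distribution over theta rather than a single number; its mean is pulled from the maximum likelihood estimate toward the prior mean, and the pull weakens as more data arrive.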

Bayesian Learning and Inference

Poor man’s Bayes

Regularization

Prior term
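
One common reading of a prior term as a regularizer is that MAP estimation in linear regression with a zero-mean Gaussian prior coincides with ridge regression. The sketch below checks this numerically; the model, the variances `sigma2` and `tau2`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.0, -2.0, 0.5])    # hypothetical ground-truth weights
sigma2, tau2 = 0.25, 1.0               # assumed noise variance and prior variance
y = X @ w_true + rng.normal(scale=np.sqrt(sigma2), size=n)

# Ridge solution with lambda = sigma2 / tau2.
lam = sigma2 / tau2
w_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def neg_log_posterior_grad(w):
    # Gradient of ||y - Xw||^2 / (2 sigma2) + ||w||^2 / (2 tau2):
    # the first term comes from the likelihood, the second from the prior.
    return -(X.T @ (y - X @ w)) / sigma2 + w / tau2

grad_norm = np.linalg.norm(neg_log_posterior_grad(w_map))
w_ml = np.linalg.lstsq(X, y, rcond=None)[0]   # maximum likelihood (no prior)
print(grad_norm, np.linalg.norm(w_map), np.linalg.norm(w_ml))
```

The gradient of the negative log posterior vanishes at the ridge solution, so the two estimates agree, and the prior shrinks the weight norm relative to maximum likelihood.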

Learning from incomplete data

Advantages of Bayesian framework

• Regularization
  • Incorporates the specifics of a particular problem

• Extensibility
  • Builds complex models from simpler ones

• Latent variable modeling
  • Learns from incomplete data

• Ensembling
  • Performs weighted voting across multiple algorithms

• Scalability (new!)
  • Applicable to large datasets when combined with deep neural networks

Exponential class of distributions

Log-concavity of exponential class

Example: Gaussian distribution
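
A standard way to see that the univariate Gaussian belongs to the exponential class is to rewrite its density in natural parameters. The notation $\theta$, $u(x)$, $A(\theta)$ is assumed here and may differ from the slide's.

```latex
\begin{align}
\mathcal{N}(x \mid \mu, \sigma^2)
  &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
   = \exp\!\big(\theta^\top u(x) - A(\theta)\big), \\
u(x) &= \begin{pmatrix} x \\ x^2 \end{pmatrix}, \qquad
\theta = \begin{pmatrix} \mu/\sigma^2 \\ -1/(2\sigma^2) \end{pmatrix}, \\
A(\theta) &= \frac{\mu^2}{2\sigma^2} + \frac{1}{2}\log\!\big(2\pi\sigma^2\big)
           = -\frac{\theta_1^2}{4\theta_2} + \frac{1}{2}\log\!\left(-\frac{\pi}{\theta_2}\right).
\end{align}
```

Because $A(\theta)$ is convex, the log-density is concave in $\theta$, which is the log-concavity property named on the preceding slides.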

Incomplete likelihood

Variational lower bound
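
The usual decomposition behind the variational lower bound is sketched here under assumed notation: $X$ for observed data, $T$ for latent variables (matching the "Discrete T" and "Continuous T" slides), $\theta$ for parameters, and $q(T)$ for an arbitrary distribution over the latent variables.

```latex
\begin{align}
\log p(X \mid \theta)
  &= \mathcal{L}(q, \theta)
   + \mathrm{KL}\big(q(T)\,\|\,p(T \mid X, \theta)\big), \\
\mathcal{L}(q, \theta)
  &= \mathbb{E}_{q(T)} \log p(X, T \mid \theta)
   - \mathbb{E}_{q(T)} \log q(T)
  \;\le\; \log p(X \mid \theta).
\end{align}
```

Since the KL divergence is non-negative, $\mathcal{L}$ bounds the incomplete log-likelihood from below. In this notation the E-step sets $q(T) = p(T \mid X, \theta^{\text{old}})$, which makes the bound tight, and the M-step maximizes $\mathbb{E}_{q(T)} \log p(X, T \mid \theta)$ over $\theta$.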

EM-algorithm

Discrete T

Mixture of Gaussians

Mixture of Gaussians: formal description

EM-algorithm for mixture of Gaussians
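
A minimal sketch of EM for a one-dimensional mixture of Gaussians follows. The function name `em_gmm_1d`, the quantile-based initialization, the small variance floor, and the synthetic data are assumptions chosen for illustration, not details from the slides.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=100):
    """Fit a k-component 1D Gaussian mixture with EM; returns (pi, mu, var, log_liks)."""
    n = x.size
    pi = np.full(k, 1.0 / k)
    # Heuristic initialization (an assumption): spread the means over data quantiles.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    var = np.full(k, x.var())
    log_liks = []
    for _ in range(n_iter):
        # E-step: log of pi_j * N(x_i | mu_j, var_j), shape (n, k).
        log_p = (np.log(pi) - 0.5 * np.log(2 * np.pi * var)
                 - 0.5 * (x[:, None] - mu) ** 2 / var)
        # Log-sum-exp over components gives log p(x_i) in a numerically stable way.
        m = log_p.max(axis=1, keepdims=True)
        log_norm = m + np.log(np.exp(log_p - m).sum(axis=1, keepdims=True))
        log_liks.append(float(log_norm.sum()))
        gamma = np.exp(log_p - log_norm)      # responsibilities p(t_i = j | x_i)
        # M-step: responsibility-weighted maximum likelihood updates.
        nk = gamma.sum(axis=0)
        pi = nk / n
        mu = (gamma * x[:, None]).sum(axis=0) / nk
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / nk + 1e-6  # small floor
    return pi, mu, var, log_liks

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 0.5, 200), rng.normal(3.0, 1.0, 300)])
pi, mu, var, log_liks = em_gmm_1d(x, k=2)
print(np.round(pi, 2), np.round(mu, 2), np.round(var, 2))
```

Tracking `log_liks` is a convenient sanity check: up to the tiny effect of the variance floor, the incomplete log-likelihood should not decrease from one iteration to the next.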

Continuous T

Example: PCA model
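
One common way to cast PCA as a latent variable model with continuous $T$ is probabilistic PCA. Whether the slide used exactly this form is an assumption, as are the symbols $W$, $\mu$, and $\sigma^2$.

```latex
\begin{align}
p(t) &= \mathcal{N}(t \mid 0, I), \\
p(x \mid t, \theta) &= \mathcal{N}(x \mid W t + \mu,\; \sigma^2 I),
  \qquad \theta = (W, \mu, \sigma^2), \\
p(x \mid \theta) &= \int p(x \mid t, \theta)\, p(t)\, dt
  = \mathcal{N}\big(x \mid \mu,\; W W^\top + \sigma^2 I\big).
\end{align}
```

Here $t$ is a low-dimensional latent code for each observation $x$, and both EM steps are available in closed form because every distribution involved is Gaussian.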

Advantages of EM PCA

Mixture of PCA

Example: Latent Dirichlet Allocation

LDA: formal description

General nature of EM-framework

Variational inference: way to complex Bayesian models

The most difficult part

Dropout
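
For reference, standard binary dropout can be sketched in a few lines. The "inverted" scaling convention and the function name `dropout` are assumptions made here for illustration.

```python
import numpy as np

def dropout(h, p, rng, train=True):
    """Inverted binary dropout: zero each unit with probability p and
    rescale the surviving units by 1 / (1 - p) so the expectation is unchanged."""
    if not train or p == 0.0:
        return h                      # at test time the layer is the identity
    mask = rng.random(h.shape) >= p   # True for units that are kept
    return h * mask / (1.0 - p)

rng = np.random.default_rng(0)
h = np.ones((1000, 100))              # hypothetical layer activations
h_drop = dropout(h, p=0.5, rng=rng)
print(h_drop.mean())
```

The dropout rate `p` is a fixed hyperparameter in this form, which is the limitation that adjustable and individual dropout rates address.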

Variational Dropout

Diederik P. Kingma, Tim Salimans, Max Welling. Variational Dropout and the Local Reparameterization Trick. arXiv:1506.02557

Adjusting dropout rates

Individual dropout rate

Alternative view on dropout

[Labels: data term, regularization]

[Figure labels: deterministic output, Gaussian noise, noisy output]
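
The picture of a deterministic output perturbed by Gaussian noise matches the local reparameterization trick from the Kingma, Salimans and Welling paper cited earlier. The sketch below assumes a single linear layer with independent Gaussian weights of mean `theta` and variance `alpha * theta**2`; the function name `noisy_linear` and the test values are hypothetical.

```python
import numpy as np

def noisy_linear(a, theta, alpha, rng):
    """Sample the outputs of a linear layer with weights
    w_ij ~ N(theta_ij, alpha * theta_ij^2) without sampling the weights.

    Each output is Gaussian, so it is enough to sample
    mean + sqrt(variance) * eps with eps ~ N(0, 1).
    """
    gamma = a @ theta                          # deterministic output (mean)
    delta = alpha * (a ** 2) @ (theta ** 2)    # output variance
    eps = rng.standard_normal(gamma.shape)     # Gaussian noise
    return gamma + np.sqrt(delta) * eps        # noisy output

rng = np.random.default_rng(0)
theta = np.array([[1.0, -1.0], [0.5, 2.0], [-1.5, 0.5], [2.0, 1.0]])
a = np.ones((20000, 4))                        # hypothetical inputs, identical rows
out = noisy_linear(a, theta, alpha=0.5, rng=rng)
print(out.mean(axis=0), out.var(axis=0))
```

With identical input rows, the empirical mean and variance of `out` can be compared against `theta.sum(axis=0)` and `0.5 * (theta ** 2).sum(axis=0)`; setting `alpha=0` recovers the deterministic layer.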

Sparsifying deep NNs

99.2% of zeros

Summer school

Conclusion

• The Bayesian framework is extremely powerful and extends ML tools

• We do have scalable algorithms for approximate Bayesian inference

• Bayes + Deep Learning =

• Even the first attempts at neuro-Bayesian inference give impressive results