
Statistics Seminar
Joint organization of statistics research groups
Faculty of Science and Faculty of Economics and Business
Leuven Statistics Research Centre (LSTAT)

Prof. Dr Mauro Bernardi
Department of Statistical Sciences, University of Padua, Italy.

“Fast Bayesian model selection algorithms for linear regression models”

Thursday, December 3, 2020

12:00–13:00

Online seminar, via Skype for Business.
Supporting research projects: FWO-WOG ATMS.

Abstract. In the Bayesian framework, the issue of model selection for high-dimensional linear regression has been primarily addressed by assuming hierarchical mixtures as prior distributions (see, e.g., Mitchell and Beauchamp, 1988; George and McCulloch, 1993, 1997; Ročková and George, 2014). A spike component with Dirac probability mass at zero is introduced to exclude irrelevant covariates, thereby leading to Bayesian selection procedures that rely on the computation of the marginal posterior distribution for alternative model configurations. Within this framework, evaluation of all possible models is prohibitive even for moderately large dimensions; therefore, exploration of the posterior distribution of competing models is usually performed by means of computationally intensive Markov chain Monte Carlo techniques (see, e.g., Fan and Sisson, 2011; Hastie and Green, 2012). This talk presents two methods developed to address the problem of updating the variance of the posterior distribution, and the marginal posterior density itself, after a modification of the current design matrix. First, novel algorithms are proposed that leverage a thin QR factorization (Golub and Van Loan, 2013) to update the posterior variance after the modification of the current design matrix; they completely disregard the Q matrix, thus allowing noticeable savings. Then, the evaluation of the marginal posterior, which represents the bottleneck of any Bayesian model selection procedure, is considered. It is shown that its computation relies on the inverse of the R matrix; hence a new method is presented which computes the marginal posterior of the model with an updated design matrix directly from the inverse of the R matrix. These methods do not require computationally intensive inversions of large matrices when performing marginal posterior evaluations. Exact and asymptotic computational costs, in terms of the number of floating-point operations, have been calculated to measure the efficiency of the proposed methods. The proposed methodology has been tested on simulated and real datasets in order to compare the performance of the new algorithms with existing alternatives in terms of computational time and model space exploration.

This is joint work with Claudio Busatto and Manuela Cattelan.
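
As a companion to the abstract, the sketch below illustrates the two ingredients above in a deliberately simplified conjugate setting: appending a column to a design matrix by updating only the R factor of a thin QR factorization (Q is never formed), and evaluating a log marginal likelihood through triangular solves with R alone. The Zellner g-prior, the improper prior on the error variance, and all function names below are illustrative assumptions and do not reproduce the speaker's algorithms.

# Illustrative sketch only: a conjugate g-prior toy model, not the
# algorithms presented in the talk. Priors and names are assumptions.
import numpy as np
from scipy.linalg import solve_triangular
from scipy.special import gammaln


def append_column_to_R(R, X, v):
    """Return the R factor of the thin QR of [X, v], given R for X.

    Uses only the Gram-matrix identity R'R = X'X, so Q is never needed:
        r12 = R^{-T} X'v,   r22 = sqrt(v'v - ||r12||^2).
    (A textbook update; less robust than Givens/Householder-based QR
    updating when X is severely ill conditioned.)
    """
    r12 = solve_triangular(R, X.T @ v, trans="T", lower=False)
    r22 = np.sqrt(v @ v - r12 @ r12)
    k = R.shape[0]
    R_new = np.zeros((k + 1, k + 1))
    R_new[:k, :k] = R
    R_new[:k, k] = r12
    R_new[k, k] = r22
    return R_new


def log_marginal_gprior(R, Xty, yty, n, g=100.0):
    """log p(y | model), up to a model-independent constant, from R alone.

    Under a Zellner g-prior beta ~ N(0, g * sigma^2 * (X'X)^{-1}) and
    p(sigma^2) proportional to 1/sigma^2, integrating out (beta, sigma^2)
    gives
        log p(y) = const - (k/2) log(1+g)
                   - (n/2) log(y'y - g/(1+g) * ||R^{-T} X'y||^2),
    so only triangular solves with R are required, no explicit inverse.
    """
    k = R.shape[0]
    z = solve_triangular(R, Xty, trans="T", lower=False)  # R^{-T} X'y
    s = yty - (g / (1.0 + g)) * (z @ z)
    return (gammaln(n / 2.0) - 0.5 * n * np.log(np.pi)
            - 0.5 * k * np.log(1.0 + g) - 0.5 * n * np.log(s))


# Toy usage: score a 2-column model, then append a third (irrelevant)
# column and re-score by updating R instead of refactorizing from scratch.
rng = np.random.default_rng(0)
n = 200
X = rng.standard_normal((n, 3))
y = X @ np.array([1.5, -2.0, 0.0]) + rng.standard_normal(n)

R2 = np.linalg.qr(X[:, :2], mode="r")              # R factor only, no Q
print(log_marginal_gprior(R2, X[:, :2].T @ y, y @ y, n))

R3 = append_column_to_R(R2, X[:, :2], X[:, 2])     # add the third column
print(log_marginal_gprior(R3, X.T @ y, y @ y, n))

In this toy setting a fresh thin QR of an n-by-k design costs on the order of n k^2 flops per candidate model, whereas the column update above costs on the order of n k + k^2; the talk quantifies savings of this kind exactly and asymptotically for the proposed algorithms.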

References

Fan, Y. & Sisson, S. A. (2011). Reversible jump MCMC. In Brooks, S., Gelman, A., Jones, G., and Meng, X., editors, Handbook of Markov Chain Monte Carlo, chapter 3, pages 67–92. New York: Chapman and Hall/CRC.

George, E. I. & McCulloch, R. E. (1993). Variable selection via Gibbs sampling. Journal of the American Statistical Association, 88(423):881–889.

George, E. I. & McCulloch, R. E. (1997). Approaches for Bayesian variable selection. Statistica Sinica, 7(2):339–373.

Golub, G. H. & Van Loan, C. F. (2013). Matrix computations. Johns Hopkins Studies in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, fourth edition.

Hastie, D. I. & Green, P. J. (2012). Model choice using reversible jump Markov chain Monte Carlo. Statistica Neerlandica, 66(3):309–338.

Mitchell, T. J. & Beauchamp, J. J. (1988). Bayesian variable selection in linear regression. Journal of the American Statistical Association, 83(404):1023–1032.

Ročková, V. & George, E. I. (2014). EMVS: The EM approach to Bayesian variable selection. Journal of the American Statistical Association, 109(506):828–846.
