
A note on replacing uniform subsampling by random projections in MCMC for linear regression of tall datasets

Rémi Bardenet 1, * Odalric-Ambrym Maillard 2, *
* Corresponding author
2 TAO - Machine Learning and Optimisation
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : New Markov chain Monte Carlo (MCMC) methods have been proposed to tackle inference with tall datasets, i.e., when the number n of data items is intractably large. A large class of these new MCMC methods is based on randomly subsampling the dataset at each MCMC iteration. We investigate whether random projections can replace this random subsampling for linear regression of big streaming data. In the latter setting, random projections have indeed become standard for non-Bayesian treatments. We isolate two issues for MCMC to apply to streaming regression: 1) a resampling issue: MCMC should access the same random projections across iterations to avoid keeping the whole dataset in memory; and 2) a budget issue: making individual MCMC acceptance decisions should require o(n) random projections. While the resampling issue can be satisfactorily tackled, current techniques in random projections and MCMC for tall data do not solve the budget issue, and it may well turn out that solving it is not possible.
Document type :
Preprints, Working Papers, ...

Cited literature [36 references]
Contributor : Rémi Bardenet
Submitted on : Tuesday, December 29, 2015 - 12:44:04 PM
Last modification on : Friday, April 30, 2021 - 9:55:46 AM


Files produced by the author(s)


  • HAL Id : hal-01248841, version 1


Rémi Bardenet, Odalric-Ambrym Maillard. A note on replacing uniform subsampling by random projections in MCMC for linear regression of tall datasets. 2015. ⟨hal-01248841⟩


