Bolasso: model consistent Lasso estimation through the bootstrap

Francis Bach
WILLOW - Models of visual object recognition and scene understanding, CNRS - Centre National de la Recherche Scientifique (UMR 8548), Inria Paris-Rocquencourt, DI-ENS - Département d'informatique de l'École normale supérieure
Abstract: We consider the least-squares linear regression problem with regularization by the l1-norm, a problem usually referred to as the Lasso. In this paper, we present a detailed asymptotic analysis of model consistency of the Lasso. For various decays of the regularization parameter, we compute asymptotic equivalents of the probability of correct model selection (i.e., variable selection). For a specific decay rate, we show that the Lasso selects all the variables that should enter the model with probability tending to one exponentially fast, while it selects all other variables with strictly positive probability. We show that this property implies that if we run the Lasso on several bootstrapped replications of a given sample, then intersecting the supports of the Lasso bootstrap estimates leads to consistent model selection. This novel variable selection algorithm, referred to as the Bolasso, compares favorably with other linear regression methods on synthetic data and datasets from the UCI machine learning repository.
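
For illustration only, below is a minimal sketch of the bootstrap-and-intersect idea described in the abstract, written with scikit-learn's Lasso on synthetic data; it is not the author's implementation, and the regularization parameter, number of bootstrap replications, and the threshold used to define the support are assumptions chosen for the example.

import numpy as np
from sklearn.linear_model import Lasso

def bolasso_support(X, y, alpha=0.1, n_bootstrap=128, tol=1e-10, seed=0):
    """Intersect Lasso supports over bootstrap replications (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    support = np.ones(p, dtype=bool)  # start with all variables, shrink by intersection
    for _ in range(n_bootstrap):
        idx = rng.integers(0, n, size=n)          # bootstrap sample drawn with replacement
        lasso = Lasso(alpha=alpha, max_iter=10000)
        lasso.fit(X[idx], y[idx])
        support &= np.abs(lasso.coef_) > tol      # keep only variables selected in every run
    return support

# Usage on synthetic data: the response depends only on the first 3 of 10 variables.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 10))
w = np.zeros(10)
w[:3] = [1.0, -2.0, 1.5]
y = X @ w + 0.5 * rng.standard_normal(500)
print(bolasso_support(X, y))  # expected: True for the first three variables, False elsewhere
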


https://hal.archives-ouvertes.fr/hal-00271289
Contributor: Francis Bach
Submitted on: Tuesday, April 8, 2008 - 5:36:50 PM

Files

hal_bolasso.pdf (file produced by the author)

Identifiers

  • HAL Id: hal-00271289, version 2
  • arXiv: 0804.1302

Citation

Francis Bach. Bolasso: model consistent Lasso estimation through the bootstrap. 2008. ⟨hal-00271289v2⟩
