The Smooth-Lasso and other $\ell_1+\ell_2$-penalized methods

Mohamed Hebiri; Sara A. van de Geer

Preprint, Working Paper. Year: 2011

Mohamed Hebiri (author): Seminar für Statistik; Laboratoire d'Analyse et de Mathématiques Appliquées

Sara A. van de Geer (author): Seminar für Statistik

Abstract

We consider a linear regression problem in a high-dimensional setting where the number of covariates $p$ can be much larger than the sample size $n$. In such a situation, one often assumes sparsity of the regression vector, \textit{i.e.}, that the regression vector contains many zero components. We propose a Lasso-type estimator $\hat{\beta}^{Quad}$ (where '$Quad$' stands for quadratic) based on two penalty terms. The first is the $\ell_1$ norm of the regression coefficients, which exploits the sparsity of the regression vector as the Lasso estimator does; the second is a quadratic penalty term introduced to capture additional information on the setting of the problem. We detail two special cases: the Elastic-Net $\hat{\beta}^{EN}$, which deals with sparse problems where correlations between variables may exist; and the Smooth-Lasso (S-Lasso) $\hat{\beta}^{SL}$, which addresses sparse problems where successive regression coefficients are known to vary slowly (in some situations, this can also be interpreted in terms of correlations between successive variables). From a theoretical point of view, we establish variable selection consistency results and show that $\hat{\beta}^{Quad}$ achieves a Sparsity Inequality, \textit{i.e.}, a bound in terms of the number of non-zero components of the 'true' regression vector. These results are provided under a weaker assumption on the Gram matrix than the one used for the Lasso. In some situations this guarantees a significant improvement over the Lasso. Furthermore, a simulation study shows that the S-Lasso $\hat{\beta}^{SL}$ performs better than known methods such as the Lasso, the Elastic-Net $\hat{\beta}^{EN}$, and the Fused-Lasso with respect to estimation accuracy. This is especially the case when the regression vector is 'smooth', \textit{i.e.}, when the variations between successive coefficients of the unknown regression parameter are small.
The study also reveals that the theoretical calibration of the tuning parameters and the one based on $10$-fold cross-validation yield two S-Lasso solutions with similar performance.
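
To make the two-penalty construction concrete, here is a minimal sketch in plain Python. It assumes an objective of the form $\frac{1}{2n}\|y - X\beta\|_2^2 + \lambda\|\beta\|_1 + \frac{\mu}{2}\beta^\top J \beta$; this normalization is an assumption made for illustration, since the abstract does not state the exact scaling. Taking $J = I$ gives an Elastic-Net-type penalty, and taking $J = D^\top D$ with $D$ the first-difference operator penalizes $\sum_j (\beta_j - \beta_{j-1})^2$, as in the Smooth-Lasso description. The solver is a generic cyclic coordinate descent, not the authors' algorithm, and all function names (`quad_lasso`, `difference_matrix`, and so on) are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator, the proximal map of t * |.|."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def identity_matrix(p):
    """J = I, so b^T J b = sum_j b_j^2 (Elastic-Net-type penalty)."""
    return [[1.0 if j == k else 0.0 for k in range(p)] for j in range(p)]


def difference_matrix(p):
    """J = D^T D with D the first-difference operator, so that
    b^T J b = sum_{j>=1} (b_j - b_{j-1})^2 (Smooth-Lasso-type penalty)."""
    J = [[0.0] * p for _ in range(p)]
    for j in range(1, p):
        J[j][j] += 1.0
        J[j - 1][j - 1] += 1.0
        J[j][j - 1] -= 1.0
        J[j - 1][j] -= 1.0
    return J


def objective(X, y, b, lam, mu, J):
    """Assumed criterion: (1/2n)||y - Xb||^2 + lam*||b||_1 + (mu/2) b^T J b."""
    n, p = len(y), len(b)
    resid = [y[i] - sum(X[i][k] * b[k] for k in range(p)) for i in range(n)]
    fit = sum(r * r for r in resid) / (2 * n)
    l1 = sum(abs(v) for v in b)
    quad = sum(b[j] * J[j][k] * b[k] for j in range(p) for k in range(p))
    return fit + lam * l1 + 0.5 * mu * quad


def quad_lasso(X, y, lam, mu, J, n_sweeps=200):
    """Cyclic coordinate descent; J is assumed symmetric positive semi-definite."""
    n, p = len(y), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, kept up to date after each update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            curvature = col_sq[j] + mu * J[j][j]
            if curvature == 0.0:
                continue
            corr = sum(X[i][j] * resid[i] for i in range(n)) / n
            cross = sum(J[j][k] * b[k] for k in range(p) if k != j)
            z = corr + col_sq[j] * b[j] - mu * cross
            new = soft_threshold(z, lam) / curvature  # exact 1-D minimizer
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
    return b


# Toy data (made up for illustration): identity design, noisy "bump" signal.
X = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
y = [0.0, 0.1, 1.0, 1.4, 0.9, 0.0]

b_lasso = quad_lasso(X, y, lam=0.02, mu=0.0, J=identity_matrix(6))
b_en = quad_lasso(X, y, lam=0.02, mu=0.1, J=identity_matrix(6))
b_sl = quad_lasso(X, y, lam=0.02, mu=0.1, J=difference_matrix(6))
```

With $\mu = 0$ the sketch reduces to a plain Lasso fit. Comparing `b_lasso`, `b_en`, and `b_sl` on this toy signal shows how the choice of $J$ changes the kind of shrinkage applied: the identity shrinks every coefficient toward zero, while the difference matrix pulls neighbouring coefficients toward each other.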

Keywords

Lasso; Elastic-Net; LARS; Sparsity; Variable selection; Restricted eigenvalues; High-dimensional data

Domains

Statistics [math.ST]; Theory [stat.TH]

Main file

SLassoV2.pdf (579.61 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-00462882

Submitted on: Friday, 7 October 2011, 15:40:56

Last modified on: Thursday, 14 March 2024, 03:09:26

Long-term archiving on: Sunday, 8 January 2012, 02:30:56

Dates and versions

hal-00462882, version 1 (10 March 2010)

hal-00462882, version 2 (7 October 2011)

Identifiers

HAL Id: hal-00462882, version 2
arXiv: 1003.4885

Cite

Mohamed Hebiri, Sara A. van de Geer. The Smooth-Lasso and other $\ell_1+\ell_2$-penalized methods. 2011. ⟨hal-00462882v2⟩

Collections

CNRS UNIV-MLV LAMA_UMR8050 CV_LAMA_UMR8050 LAMA_PS UPEC UNIV-EIFFEL
