Dual Extrapolation for Sparse Generalized Linear Models

Abstract: Generalized Linear Models (GLMs) form a wide class of regression and classification models, in which prediction is a function of a linear combination of the input variables. For statistical inference in high dimension, sparsity-inducing regularizations have proven useful while offering statistical guarantees. However, solving the resulting optimization problems can be challenging: even for popular iterative algorithms such as coordinate descent, one needs to loop over a large number of variables. To mitigate this, techniques known as screening rules and working sets reduce the size of the optimization problem at hand, either by progressively removing variables or by solving a growing sequence of smaller problems. For both techniques, significant variables are identified through convex duality arguments. In this paper, we show that the dual iterates of a GLM exhibit Vector AutoRegressive (VAR) behavior after sign identification when the primal problem is solved with proximal gradient descent or cyclic coordinate descent. Exploiting this regularity, one can construct dual points that offer tighter certificates of optimality, enhancing the performance of screening rules and helping to design competitive working set algorithms.
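The abstract's central mechanism — extrapolating VAR-like dual iterates to obtain a better dual point, hence a tighter duality-gap certificate — can be sketched in a few lines of NumPy. The sketch below is an illustration under stated assumptions, not the paper's implementation: the function names `extrapolate` and `lasso_dual_gap`, the Tikhonov constant used to stabilize the small linear solve, and the specialization to the Lasso (one particular sparse GLM) are all choices made here for concreteness.

```python
import numpy as np


def extrapolate(R, reg=1e-10):
    """Anderson-style extrapolation of residual vectors assumed to follow,
    at least approximately, a VAR recursion r_{t+1} = A r_t + b with the
    spectral radius of A below 1.

    R : (K, n) array whose rows are the K most recent residuals.
    Returns sum_k c_k R[k] with sum(c) = 1, where c is chosen so that the
    matching combination of successive differences (nearly) vanishes;
    this cancels the linear dynamics and targets the VAR limit point.
    """
    U = np.diff(R, axis=0)      # (K-1, n) successive differences
    G = U @ U.T                 # small (K-1, K-1) Gram matrix
    m = G.shape[0]
    # G is singular precisely when exact extrapolation is possible, so a
    # tiny Tikhonov term (an assumption of this sketch) is added first.
    try:
        z = np.linalg.solve(G + reg * np.trace(G) / m * np.eye(m),
                            np.ones(m))
    except np.linalg.LinAlgError:
        return R[-1]            # degenerate case: no extrapolation
    c = z / z.sum()
    return c @ R[:-1]


def lasso_dual_gap(X, y, beta, lam, r):
    """Duality gap for the Lasso P(beta) = 0.5 ||y - X beta||^2
    + lam ||beta||_1, certified with a dual point built from r.

    Any vector r (e.g. an extrapolated residual) is rescaled into a
    feasible dual point theta with ||X^T theta||_inf <= 1; weak duality
    then guarantees the returned gap is nonnegative, and a small gap
    certifies that beta is nearly optimal.
    """
    theta = r / max(lam, np.abs(X.T @ r).max())
    primal = 0.5 * np.sum((y - X @ beta) ** 2) + lam * np.abs(beta).sum()
    dual = 0.5 * np.sum(y ** 2) - 0.5 * np.sum((y - lam * theta) ** 2)
    return primal - dual
```

On a sequence that is exactly VAR, the extrapolated point recovers the limit up to the regularization error, whereas the last iterate is still far from it; plugging the extrapolated residual into `lasso_dual_gap` is what yields the tighter optimality certificates used by screening rules and working set strategies.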
Document type: Preprint / working paper
Cited literature: 64 references

https://hal.archives-ouvertes.fr/hal-02263500
Contributor: Mathurin Massias
Submitted on: Monday, August 5, 2019 - 8:21:13 AM
Last modification on: Wednesday, August 7, 2019 - 1:11:59 AM

File: main.pdf (produced by the author(s))

Identifiers

  • HAL Id: hal-02263500, version 1

Citation

Mathurin Massias, Samuel Vaiter, Alexandre Gramfort, Joseph Salmon. Dual Extrapolation for Sparse Generalized Linear Models. 2019. ⟨hal-02263500⟩
