A sparse version of the ridge logistic regression for large-scale text categorization

Sujeevan Aseervatham 1, Anestis Antoniadis 2, Éric Gaussier 3,*, Michel Burlet 4, Yves Denneulin 5
* Corresponding author
2 SAM - Statistique Apprentissage Machine
LJK - Laboratoire Jean Kuntzmann
4 G-SCOP_OC - OC
G-SCOP - Laboratoire des sciences pour la conception, l'optimisation et la production
5 MESCAL - Middleware efficiently scalable
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract : Ridge logistic regression has been used successfully in text categorization, where it has been shown to match the performance of the Support Vector Machine while having the main advantage of computing a probability value rather than a score. However, the ridge solution is dense, which makes it impractical for large-scale categorization. LASSO regularization, on the other hand, produces sparse solutions, but the ridge outperforms it when the number of features is larger than the number of observations and/or when the features are highly correlated. In this paper, we propose a new model selection method that approximates the ridge solution with a sparse one. The method first computes the ridge solution and then performs feature selection. Experimental evaluations show that our method yields a solution that is a good trade-off between the ridge and LASSO solutions.
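
The abstract describes a two-stage idea: fit the dense ridge solution, then select features to obtain a sparse approximation of it. The selection criterion is not stated on this page, so the following is only a minimal sketch under a loudly stated assumption: it keeps the k largest-magnitude ridge weights and refits on those features. That criterion, the synthetic data, and every name below (fit_ridge_logistic, sparsify, make_data, lam, k) are illustrative choices, not the authors' procedure.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ridge_logistic(X, y, lam=0.1, lr=0.5, n_iter=300, active=None):
    """Gradient descent on mean log-loss + (lam / 2) * ||w||^2.

    Only the features listed in `active` are updated; all other
    weights stay exactly 0.0.
    """
    n, d = len(X), len(X[0])
    active = list(range(d)) if active is None else list(active)
    w = [0.0] * d
    for _ in range(n_iter):
        grad = [0.0] * d
        for xi, yi in zip(X, y):
            p = sigmoid(sum(w[j] * xi[j] for j in active))
            for j in active:
                grad[j] += (p - yi) * xi[j] / n
        for j in active:
            w[j] -= lr * (grad[j] + lam * w[j])
    return w


def sparsify(w, k):
    # ASSUMPTION: select the k largest-magnitude ridge weights.
    # The paper's actual selection rule is not given in the abstract.
    top = sorted(range(len(w)), key=lambda j: -abs(w[j]))[:k]
    return sorted(top)


def make_data(n=200, d=10, seed=0):
    # Synthetic data (hypothetical): only features 0 and 1 carry signal.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        score = 2.0 * x[0] - 2.0 * x[1]
        y.append(1 if rng.random() < sigmoid(score) else 0)
        X.append(x)
    return X, y


X, y = make_data()
w_ridge = fit_ridge_logistic(X, y)                     # stage 1: dense ridge fit
selected = sparsify(w_ridge, k=2)                      # stage 2: feature selection
w_sparse = fit_ridge_logistic(X, y, active=selected)   # refit on kept features
```

Refitting on the selected features, rather than simply zeroing the discarded weights, lets the remaining coefficients readjust; whether the original method does this is not stated in the abstract.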

https://hal.archives-ouvertes.fr/hal-00633629
Contributor : Marie Josèphe Perruet
Submitted on : Monday, October 15, 2012 - 6:21:33 PM
Last modification on : Thursday, February 7, 2019 - 5:48:16 PM
Document(s) archived on : Wednesday, January 16, 2013 - 2:35:10 AM

File

Aseervatham-PatternRecognition...
Files produced by the author(s)

Citation

Sujeevan Aseervatham, Anestis Antoniadis, Éric Gaussier, Michel Burlet, Yves Denneulin. A sparse version of the ridge logistic regression for large-scale text categorization. Pattern Recognition Letters, Elsevier, 2011, 32 (2), pp.101-106. ⟨10.1016/j.patrec.2010.09.023⟩. ⟨hal-00633629⟩
