Sparse Oracle Inequalities for Variable Selection via Regularized Quantization

Abstract: We give oracle inequalities for procedures that combine quantization and variable selection via a weighted Lasso $k$-means type algorithm. The results are derived for a general family of weights, which can be tuned to scale the influence of the variables in different ways. Moreover, these theoretical guarantees are proved to adapt to the sparsity of the optimal codebooks, when such sparsity is present. Even without any sparsity assumption on the optimal codebooks, our procedure is proved to be close to a sparse approximation of the optimal codebooks, as has been done for Generalized Linear Models in regression. If the optimal codebooks have a sparse support, we also show that this support can be asymptotically recovered, and we give an asymptotic upper bound on the probability of misclassification. These results are illustrated with Gaussian mixture models in arbitrary dimension with sparsity assumptions on the means, which are standard distributions in model-based clustering.
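To make the idea of a weighted Lasso $k$-means type procedure concrete, here is a minimal sketch and not the paper's exact procedure. This record does not state the penalty, so the sketch assumes a coordinate-wise weighted $\ell_1$ penalty, $\lambda \sum_j \sum_p w_p |c_j^{(p)}|$, added to the empirical distortion, minimized by a Lloyd-type alternation (nearest-center assignment, then soft-thresholded cluster means). The names `weighted_lasso_kmeans`, `weights`, `lam`, and the toy data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the Lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def nearest_center(point, centers):
    """Index of the center closest to point in squared Euclidean distance."""
    dists = [sum((x - c) ** 2 for x, c in zip(point, center)) for center in centers]
    return dists.index(min(dists))


def weighted_lasso_kmeans(points, init_centers, weights, lam, n_iter=50):
    """Alternating minimization of the ASSUMED objective
    (1/n) * sum_i min_j ||x_i - c_j||^2 + lam * sum_j sum_p weights[p] * |c_j[p]|.
    """
    n, d = len(points), len(points[0])
    centers = [list(c) for c in init_centers]
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [nearest_center(x, centers) for x in points]
        new_centers = []
        for j, old in enumerate(centers):
            members = [x for x, lab in zip(points, labels) if lab == j]
            if not members:
                new_centers.append(list(old))  # keep an empty cell's center
                continue
            n_j = len(members)
            center = []
            for p in range(d):
                mean = sum(m[p] for m in members) / n_j
                # Exact coordinate-wise minimizer for fixed assignments.
                center.append(soft_threshold(mean, lam * weights[p] * n / (2.0 * n_j)))
            new_centers.append(center)
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest_center(x, centers) for x in points]
    return centers, labels


# Toy data (hypothetical): only the first variable separates the two groups.
points = [(-2.1, 0.05, -0.02), (-1.9, -0.04, 0.03), (-2.0, 0.02, -0.01),
          (2.0, -0.03, 0.04), (1.9, 0.06, -0.05), (2.1, -0.01, 0.02)]
centers, labels = weighted_lasso_kmeans(
    points, init_centers=[points[0], points[3]], weights=[1.0, 1.0, 1.0], lam=0.2)

# A variable counts as selected if some center has a nonzero coordinate on it.
selected = [p for p in range(3) if any(c[p] != 0.0 for c in centers)]
print(selected)  # [0]
```

In this toy run the two uninformative coordinates of both centers are shrunk exactly to zero, so only the first variable remains in the support of the codebook. Raising an entry of `weights` penalizes the corresponding variable more heavily, which is one way the influence of individual variables could be tuned.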
Document type:
Preprint, working paper
2015

https://hal.archives-ouvertes.fr/hal-01005545
Contributor: Clément Levrard
Submitted on: Wednesday, 6 July 2016 - 11:28:35
Last modified on: Wednesday, 29 November 2017 - 16:45:50
Document(s) archived on: Friday, 7 October 2016 - 10:28:30

Files

Sparseoracleinequalitiesforfea...
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01005545, version 3
  • arXiv: 1406.3334

Collections

INSMI | UPMC | USPC | PMA

Citation

Clément Levrard. Sparse Oracle Inequalities for Variable Selection via Regularized Quantization. 2015. 〈hal-01005545v3〉

Metrics

Record views

165

File downloads

37