Finite mixture regression: a sparse variable selection by model selection for clustering.

Emilie Devijver 1, 2
1 SELECT - Model selection in statistical learning
Inria Saclay - Ile de France, LMO - Laboratoire de Mathématiques d'Orsay, CNRS - Centre National de la Recherche Scientifique : UMR
Abstract: We consider a finite mixture of Gaussian regression models for high-dimensional data, where the number of covariates may be much larger than the sample size. We propose to estimate the unknown conditional mixture density by a maximum likelihood estimator restricted to the relevant variables, which are selected by an ℓ1-penalized maximum likelihood estimator. We obtain an oracle inequality satisfied by this estimator with a Jensen-Kullback-Leibler type loss. This oracle inequality is deduced from a general model selection theorem for maximum likelihood estimators with a random model collection. From it we derive the shape of the penalty in the criterion, which depends on the complexity of the random model collection.
Document type:
Journal article
Electronic Journal of Statistics, Shaker Heights, OH: Institute of Mathematical Statistics, 2015

https://hal.archives-ouvertes.fr/hal-01060079
Contributor: Emilie Devijver
Submitted on: Wednesday, 3 September 2014 - 21:26:29
Last modified on: Thursday, 11 January 2018 - 06:22:14
Document(s) archived on: Thursday, 4 December 2014 - 10:10:48

Files

inegOracleProc.pdf
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id: hal-01060079, version 1
  • arXiv: 1409.1331

Citation

Emilie Devijver. Finite mixture regression: a sparse variable selection by model selection for clustering. Electronic Journal of Statistics, Shaker Heights, OH: Institute of Mathematical Statistics, 2015. 〈hal-01060079〉

Metrics

Record views

570

File downloads

292