Bayesian Variable Selection for Globally Sparse Probabilistic PCA

Abstract: Sparse versions of principal component analysis (PCA) have established themselves as simple yet powerful ways of selecting relevant features of high-dimensional data in an unsupervised manner. However, when several sparse principal components are computed, the selected variables are difficult to interpret, since each axis has its own sparsity pattern and has to be interpreted separately. To overcome this drawback, we propose a Bayesian procedure called globally sparse probabilistic PCA (GSPPCA) that yields several sparse components sharing the same sparsity pattern. This allows the practitioner to identify the original variables that are relevant to describe the data. To this end, using Roweis' probabilistic interpretation of PCA and a Gaussian prior on the loading matrix, we provide the first exact computation of the marginal likelihood of a Bayesian PCA model. To avoid the drawbacks of discrete model selection, a simple relaxation of this framework is presented. It produces a path of models by means of a variational expectation-maximization algorithm. The exact marginal likelihood is then maximized over this path. This approach is illustrated on real and synthetic data sets. In particular, using unlabeled microarray data, GSPPCA infers much more relevant gene subsets than traditional sparse PCA algorithms.
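The difference between per-axis sparsity and a shared (global) sparsity pattern can be made concrete with a small sketch. The code below is not the GSPPCA procedure described in the abstract; it only checks, for two hypothetical loading matrices, whether all components share the same support and which original variables end up selected. The helper names (`support`, `selected_variables`, `is_globally_sparse`) and the numeric values are assumptions made for illustration.

```python
# Illustrative sketch only: this is NOT the GSPPCA algorithm from the paper.
# It contrasts per-component sparsity with a shared (global) sparsity pattern.
# All helper names and matrix values below are hypothetical.

def support(column):
    """Indices of the original variables with a nonzero loading."""
    return {i for i, w in enumerate(column) if w != 0.0}

def columns_of(loadings):
    """Split a row-major loading matrix into its component columns."""
    n_components = len(loadings[0])
    return [[row[k] for row in loadings] for k in range(n_components)]

def selected_variables(loadings):
    """Union of the supports of all components."""
    return set().union(*(support(col) for col in columns_of(loadings)))

def is_globally_sparse(loadings):
    """True if every component has the same sparsity pattern."""
    supports = [support(col) for col in columns_of(loadings)]
    return all(s == supports[0] for s in supports)

# 5 variables (rows), 2 components (columns); each axis has its own pattern.
per_axis = [
    [0.8, 0.0],
    [0.6, 0.0],
    [0.0, 0.7],
    [0.0, 0.7],
    [0.0, 0.0],
]

# Same shape, but both axes load on variables 0 and 1 only.
global_pattern = [
    [0.8, -0.6],
    [0.6, 0.8],
    [0.0, 0.0],
    [0.0, 0.0],
    [0.0, 0.0],
]

print(is_globally_sparse(per_axis), sorted(selected_variables(per_axis)))
print(is_globally_sparse(global_pattern), sorted(selected_variables(global_pattern)))
```

In the first matrix, four of the five variables are retained once both axes are considered, even though each axis is sparse on its own. In the second, the shared pattern singles out two variables for the whole subspace, which is the kind of variable selection the abstract is aiming at.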
Document type:
Preprint, Working paper
An earlier version of this paper appeared in the Proceedings of the 19th International Conference.. 2016

https://hal.archives-ouvertes.fr/hal-01310409
Contributor: Pierre-Alexandre Mattei
Submitted on: Tuesday, 20 September 2016 - 18:04:46
Last modified on: Tuesday, 11 October 2016 - 13:24:22

File

GSPPCAv2.pdf
Files produced by the author(s)

License


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id: hal-01310409, version 2
  • arXiv: 1605.05918

Citation

Charles Bouveyron, Pierre Latouche, Pierre-Alexandre Mattei. Bayesian Variable Selection for Globally Sparse Probabilistic PCA. An earlier version of this paper appeared in the Proceedings of the 19th International Conference.. 2016. <hal-01310409v2>
