Model-Based Clustering of High-Dimensional Data: A review

Abstract: Model-based clustering is a popular tool renowned for its probabilistic foundations and its flexibility. However, high-dimensional data are increasingly common and, unfortunately, classical model-based clustering techniques perform poorly in high-dimensional spaces, mainly because these methods are severely over-parametrized in this setting. Nevertheless, high-dimensional spaces have specific characteristics that are useful for clustering, and recent techniques exploit them. After recalling the basics of model-based clustering, this article reviews dimension reduction approaches, regularization-based techniques, parsimonious modeling, subspace clustering methods, and clustering methods based on variable selection. Existing software for model-based clustering of high-dimensional data is also reviewed, and its practical use is illustrated on real-world data sets.
Document type: Journal article
Computational Statistics and Data Analysis, Elsevier, 2013, 71, pp.52-78. <10.1016/j.csda.2012.12.008>

https://hal.archives-ouvertes.fr/hal-00750909
Contributor: Charles Bouveyron
Submitted on: Monday, 12 November 2012 - 16:31:30
Last modified on: Thursday, 9 February 2017 - 01:06:05
Document(s) archived on: Wednesday, 13 February 2013 - 03:46:26

File

hal_ReviewHD.pdf
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Charles Bouveyron, Camille Brunet. Model-Based Clustering of High-Dimensional Data: A review. Computational Statistics and Data Analysis, Elsevier, 2013, 71, pp.52-78. <10.1016/j.csda.2012.12.008>. <hal-00750909>
