Model-Based Clustering of High-Dimensional Data: A review

Charles Bouveyron; Camille Brunet

doi:10.1016/j.csda.2012.12.008

Journal article, Computational Statistics and Data Analysis, Year: 2013


Charles Bouveyron

Role: Author
PersonId : 347
IdHAL : charles-bouveyron
ORCID : 0000-0002-6956-4491
IdRef : 112244785

Mathématiques Appliquées Paris 5

Camille Brunet

Role: Author
PersonId : 859585

Laboratoire Angevin de Recherche en Mathématiques

Abstract

Model-based clustering is a popular tool, renowned for its probabilistic foundations and its flexibility. However, high-dimensional data are increasingly common, and classical model-based clustering techniques perform poorly in high-dimensional spaces, mainly because they are severely over-parametrized in this setting. High-dimensional spaces nevertheless have specific characteristics that are useful for clustering, and recent techniques exploit them. After recalling the basics of model-based clustering, this article reviews dimension reduction approaches, regularization-based techniques, parsimonious modeling, subspace clustering methods and clustering methods based on variable selection. Existing software for model-based clustering of high-dimensional data is also reviewed, and its practical use is illustrated on real-world data sets.
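To make the over-parametrization concrete, the sketch below counts the free parameters of an unconstrained Gaussian mixture. It assumes the classical model with K components, each having its own mean vector and full covariance matrix in p dimensions. The function name and the example values are illustrative only.

```python
def gmm_free_parameters(n_components: int, n_features: int) -> int:
    """Count the free parameters of a full-covariance Gaussian mixture.

    Assumption: every component has its own mean vector and its own
    unconstrained symmetric covariance matrix.
    """
    # Mixing proportions sum to one, so one of them is determined by the others.
    proportions = n_components - 1
    # One mean vector of length p per component.
    means = n_components * n_features
    # A symmetric p x p matrix has p(p + 1) / 2 distinct entries.
    covariances = n_components * n_features * (n_features + 1) // 2
    return proportions + means + covariances


if __name__ == "__main__":
    # Illustrative values: 4 components, increasing dimension.
    for p in (2, 10, 100):
        print(p, gmm_free_parameters(4, p))
```

The covariance term grows quadratically with p, so with 4 components and 100 variables the count already reaches 20603, which is typically far more than can be estimated reliably from a data set of ordinary size.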

Domains

Statistics [math.ST] Theory [stat.TH]

Main file

hal_ReviewHD.pdf (559.34 KB)

Origin: Files produced by the author(s)


https://hal.science/hal-00750909

Submitted on: Monday, November 12, 2012, 16:31:30

Last modified on: Thursday, April 11, 2024, 13:16:13

Long-term archiving on: Wednesday, February 13, 2013, 03:46:26

Dates and versions

hal-00750909 , version 1 (12-11-2012)

License

Attribution

Identifiers

HAL Id : hal-00750909 , version 1
DOI : 10.1016/j.csda.2012.12.008

Cite

Charles Bouveyron, Camille Brunet. Model-Based Clustering of High-Dimensional Data: A review. Computational Statistics and Data Analysis, 2013, 71, pp.52-78. ⟨10.1016/j.csda.2012.12.008⟩. ⟨hal-00750909⟩


Collections

CNRS UNIV-ANGERS LAREMA MAP5 UP-SCIENCES

876 views

6544 downloads
