Journal articles

Model-Based Clustering of High-Dimensional Data: A review

Abstract: Model-based clustering is a popular tool renowned for its probabilistic foundations and its flexibility. However, high-dimensional data are increasingly common and, unfortunately, classical model-based clustering techniques perform poorly in high-dimensional spaces. This is mainly because model-based clustering methods are severely over-parametrized in this setting. Nevertheless, high-dimensional spaces have specific characteristics that are useful for clustering, and recent techniques exploit them. After recalling the basics of model-based clustering, this article reviews dimension reduction approaches, regularization-based techniques, parsimonious modeling, subspace clustering methods and clustering methods based on variable selection. Existing software for model-based clustering of high-dimensional data is also reviewed, and its practical use is illustrated on real-world data sets.

Cited literature: 98 references
Contributor: Charles Bouveyron
Submitted on: Monday, November 12, 2012 - 4:31:30 PM
Last modification on: Wednesday, October 27, 2021 - 3:02:36 PM
Long-term archiving on: Wednesday, February 13, 2013 - 3:46:26 AM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License



Charles Bouveyron, Camille Brunet. Model-Based Clustering of High-Dimensional Data: A review. Computational Statistics and Data Analysis, Elsevier, 2013, 71, pp.52-78. ⟨10.1016/j.csda.2012.12.008⟩. ⟨hal-00750909⟩


