Model-based clustering of high-dimensional data in Astrophysics

Abstract : The nature of data in Astrophysics has changed over the past decades, as in other scientific fields, owing to increased measurement capabilities. As a consequence, data are nowadays frequently high-dimensional and available in bulk or as streams. Model-based techniques for clustering are popular tools, renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques behave disappointingly in high-dimensional spaces, mainly because of their severe over-parametrization. Recent developments in model-based classification overcome these drawbacks and make it possible to classify high-dimensional data efficiently, even in the "small n / large p" situation. This work presents a comprehensive review of these recent approaches, including regularization-based techniques, parsimonious modeling, subspace classification methods and classification methods based on variable selection. The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.
Document type :
Book sections

Cited literature : 49 references
Contributor : Charles Bouveyron
Submitted on : Tuesday, February 9, 2016 - 11:34:45 AM
Last modification on : Friday, September 20, 2019 - 4:34:03 PM
Long-term archiving on : Saturday, November 12, 2016 - 3:06:44 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License


  • HAL Id : hal-01264844, version 2



Charles Bouveyron. Model-based clustering of high-dimensional data in Astrophysics. Statistics for Astrophysics: Clustering and Classification, EAS Publications Series, 77, EDP Sciences, pp.91-119, 2016. ⟨hal-01264844v2⟩


