Model-based clustering of high-dimensional data in Astrophysics

Abstract: As in other scientific fields, the nature of data in Astrophysics has changed over the past decades with the growth in measurement capabilities. As a consequence, data are now frequently high-dimensional and available in bulk or as streams. Model-based clustering techniques are popular tools, renowned for their probabilistic foundations and their flexibility. However, classical model-based techniques behave disappointingly in high-dimensional spaces, mainly because they are severely over-parametrized. Recent developments in model-based classification overcome these drawbacks and make it possible to classify high-dimensional data efficiently, even in the "small n / large p" situation. This work presents a comprehensive review of these recent approaches, including regularization-based techniques, parsimonious modeling, subspace classification methods, and classification methods based on variable selection. The use of these model-based methods is also illustrated on real-world classification problems in Astrophysics using R packages.
Document type:
Book chapter
Statistics for Astrophysics: Clustering and Classification, EAS Publications Series, 77, EDP Sciences, pp.91-119, 2016

Cited literature [49 references]

https://hal.archives-ouvertes.fr/hal-01264844
Contributor: Charles Bouveyron
Submitted on: Tuesday, 9 February 2016 - 11:34:45
Last modified on: Tuesday, 10 October 2017 - 11:22:04
Document(s) archived on: Saturday, 12 November 2016 - 15:06:44

File

chapitreAstro.pdf
Files produced by the author(s)

License


Distributed under a Creative Commons Attribution 4.0 International License

Identifiers

  • HAL Id: hal-01264844, version 2

Citation

Charles Bouveyron. Model-based clustering of high-dimensional data in Astrophysics. Statistics for Astrophysics: Clustering and Classification, EAS Publications Series, 77, EDP Sciences, pp.91-119, 2016. 〈hal-01264844v2〉
