New efficient clustering quality indexes

Abstract: This paper addresses a major challenge in clustering: optimal model selection. It presents new, efficient clustering quality indexes based on feature maximization, an alternative to the usual distributional measures based on entropy or the Chi-square metric, and to vector-based measures such as Euclidean distance or correlation distance. Initial experiments compare the behavior of these new indexes with that of the usual cluster quality indexes based on Euclidean distance, on different kinds of test datasets for which ground truth is available. This comparison highlights the superior accuracy and stability of the new method on these datasets, its efficiency from low-dimensional to high-dimensional data, and its tolerance to noise. Further experiments are then conducted on "real-life" textual data extracted from a multisource bibliographic database for which ground truth is unknown. These experiments show that the accuracy and stability of the new indexes make it possible to handle diachronic analysis efficiently, whereas other indexes do not meet the requirements of this task.
Document type:
Conference paper
International Joint Conference on Neural Networks (IJCNN 2016), Jul 2016, Vancouver, Canada

Contributor: Nicolas Dugue
Submitted on: Saturday, July 30, 2016 - 01:39:44
Last modified on: Tuesday, December 18, 2018 - 16:38:02
Document(s) archived on: Monday, October 31, 2016 - 10:20:16


Files produced by the author(s)


  • HAL Id : hal-01350509, version 1



Jean-Charles Lamirel, Nicolas Dugué, Pascal Cuxac. New efficient clustering quality indexes. International Joint Conference on Neural Networks (IJCNN 2016), Jul 2016, Vancouver, Canada. 〈hal-01350509〉


