Journal article, Neural Computing and Applications, Year: 2021

Evaluating clustering quality using features salience: a promising approach

Abstract

This paper focuses on using feature salience to evaluate the quality of a partition in hard clustering. It is based on the hypothesis that a good partition is an easy-to-label partition, i.e. a partition in which each cluster is made up of salient features. This approach is compared mainly to the usual approaches that rely on distances between data points, but also to more recent approaches based on entropy or stability. We show that our feature-based indexes outperform the compared indexes for optimal model selection: they are more efficient across the low- to high-dimensional range and more robust to noise. To show the efficiency of our indexes in a real-life application, we consider the task of diachronic analysis on a textual dataset. We demonstrate that our approach yields interesting and relevant results in that context, while other approaches mostly lead to unusable results.
File not deposited

Dates and versions

hal-03714726, version 1 (5 July 2022)

Cite

Nicolas Dugué, Jean-Charles Lamirel, Yue Chen. Evaluating clustering quality using features salience: a promising approach. Neural Computing and Applications, 2021, 33 (19), pp.12939-12956. ⟨10.1007/s00521-021-05942-7⟩. ⟨hal-03714726⟩