A new efficient and unbiased approach for clustering quality evaluation

Abstract: Traditional quality indexes (Inertia, DB, ...) are known to be method-dependent and fail to properly estimate clustering quality in several cases, such as complex data like textual data. We therefore propose an alternative approach to clustering quality evaluation based on unsupervised measures of Recall, Precision and F-measure that exploit the descriptors of the data associated with the obtained clusters. Two categories of indexes are proposed: Macro and Micro indexes. This paper also focuses on the construction of a new cumulative Micro precision index that makes it possible to evaluate the overall quality of a clustering result while clearly distinguishing between homogeneous and heterogeneous, or degenerate, results. The behavior of the classical indexes is compared experimentally with that of our new approach on a polythematic dataset of bibliographic references drawn from the PASCAL database.

Cited literature: 19 references

https://hal.archives-ouvertes.fr/hal-00955498
Contributor: Patricia Gautier
Submitted on: Tuesday, March 4, 2014 - 3:38:46 PM
Last modification on: Tuesday, December 18, 2018 - 4:38:01 PM
Document(s) archived on: Wednesday, June 4, 2014 - 11:46:41 AM

File

qimie2011_submission_10.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-00955498, version 1

Citation

Jean-Charles Lamirel, Pascal Cuxac, Raghvendra Mall. A new efficient and unbiased approach for clustering quality evaluation. QIMIE'11, May 2011, Shenzhen, China. pp.209-220. ⟨hal-00955498⟩
