Conference paper, Year: 2001

A cross-comparison of two clustering methods

Abstract

Many Natural Language Processing applications require semantic knowledge about topics in order to be feasible or efficient. We therefore developed a system, SEGAPSITH, that acquires this knowledge automatically from text segments using an unsupervised, incremental clustering method. An important problem in such an approach is validating the learned classes. To do so, we applied a second clustering method, which only needs to know the number of classes to build, to the same subset of text segments, and we reformulated the evaluation problem as a comparison of the two classifications. We established different criteria to compare them, based either on the words used as class descriptors or on the thematic units. Our first results show a strong correlation between the two classifications.
Main file
W01-0909.pdf (67.49 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02458023, version 1 (28-01-2020)

Identifiers

  • HAL Id: hal-02458023, version 1

Cite

Olivier Ferret, Brigitte Grau, Michèle Jardino. A cross-comparison of two clustering methods. Proceedings of the workshop on Evaluation for Language and Dialogue Systems-Volume 9, 2001, Toulouse, France. pp.9. ⟨hal-02458023⟩
