Conference paper. Year: 2004

Mining textual data through term variant clustering: the TermWatch system

Abstract

We present a system for mapping the structure of research topics in a corpus. TermWatch portrays the "aboutness" of a corpus of scientific and technical publications by bridging the gap between purely statistical approaches and symbolic techniques. In this paper, we report an unsupervised text-mining experiment on a corpus of scientific titles and abstracts from 16 prominent information retrieval (IR) journals. The preliminary results show that TermWatch captures low-frequency phenomena that the usual co-occurrence-based clustering methods may not highlight. The results also reflect the expressive power of terminological variation as a means of capturing the structure of the research topics contained in a corpus.
Main file
riao-04-ibesan.pdf (605.94 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00636163, version 1 (26-10-2011)

Identifiers

  • HAL Id: hal-00636163, version 1

Cite

Fidelia Ibekwe-Sanjuan, Eric Sanjuan. Mining textual data through term variant clustering: the TermWatch system. Recherche d'Information et ses Applications (RIAO 2004), Apr 2004, Avignon, France. pp.487-503. ⟨hal-00636163⟩
265 views
302 downloads
