Conference paper, Lecture Notes in Computer Science, Year: 2014

Knowledge discovery with CRF-based clustering of named entities without a priori classes

Vincent Claveau
Abir Ncibi
  • Role: Author
  • PersonId: 949220

Abstract

Knowledge discovery aims to bring out coherent groups of entities. It usually relies on clustering, which requires defining a notion of similarity between the relevant entities. In this paper, we propose to repurpose a supervised machine learning technique (namely Conditional Random Fields, widely used for supervised labeling tasks) to compute similarities among text sequences indirectly and without supervision. Our approach consists in generating artificial labeling problems on the data, so that regularities between entities are revealed through the labels they receive. We describe how this framework can be implemented and evaluate it on two information extraction/discovery tasks. The results demonstrate the usefulness of this unsupervised approach and open many avenues for defining similarities over complex representations of textual data.
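
The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading of "artificial labeling problems": repeatedly give arbitrary labels to a random half of the entities, train a labeler on them, apply it to the other half, and count how often two entities end up with the same label. Everything here is an assumption made for illustration: the entity format (a name mapped to a set of features), the function names, the parameters `n_rounds` and `n_labels`, and above all the labeler, which is a simple feature-vote model standing in for a Conditional Random Field so that the sketch runs with the standard library alone.

```python
import random
from collections import Counter, defaultdict
from itertools import combinations


def train_stand_in(train_items, labels):
    """Count how often each feature co-occurs with each artificial label.

    ASSUMPTION: this feature-vote model stands in for a CRF so that the
    sketch runs with the standard library only.
    """
    votes = defaultdict(Counter)
    for (_, feats), label in zip(train_items, labels):
        for f in feats:
            votes[f][label] += 1
    return votes


def predict(votes, feats):
    """Return the label with the most feature votes, or None if no feature is known."""
    tally = Counter()
    for f in feats:
        tally.update(votes.get(f, {}))
    if not tally:
        return None
    # Break ties deterministically: highest count first, then smallest label.
    return min(tally, key=lambda lab: (-tally[lab], lab))


def colabel_similarity(items, n_rounds=200, n_labels=3, seed=0):
    """Estimate pairwise similarity as the co-labeling rate over random problems.

    items: dict mapping entity name -> set of features (hypothetical format).
    """
    rng = random.Random(seed)
    names = sorted(items)
    same = Counter()  # rounds in which a pair received the same label
    seen = Counter()  # rounds in which both members of a pair were labeled
    for _ in range(n_rounds):
        shuffled = names[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        train, apply_set = shuffled[:half], shuffled[half:]
        # Artificial problem: training entities receive arbitrary labels.
        labels = [rng.randrange(n_labels) for _ in train]
        votes = train_stand_in([(n, items[n]) for n in train], labels)
        predicted = {n: predict(votes, items[n]) for n in apply_set}
        for a, b in combinations(sorted(apply_set), 2):
            if predicted[a] is None or predicted[b] is None:
                continue
            seen[(a, b)] += 1
            if predicted[a] == predicted[b]:
                same[(a, b)] += 1
    return {pair: same[pair] / seen[pair] for pair in seen}


# Hypothetical toy entities described by made-up context features.
entities = {
    "Paris": {"cap", "prev=in", "next=,"},
    "Lyon": {"cap", "prev=in", "next=,"},
    "Berlin": {"cap", "prev=in", "next=."},
    "Smith": {"cap", "prev=Mr", "next=said"},
    "Jones": {"cap", "prev=Mr", "next=said"},
    "Brown": {"cap", "prev=Dr", "next=said"},
}
sim = colabel_similarity(entities)
```

In this sketch, entities with identical feature sets (the hypothetical `Paris` and `Lyon`) always receive the same label from the stand-in labeler, so their co-labeling rate is 1.0, while entities that share fewer features agree less systematically. The resulting pairwise scores could then be passed to any similarity-based clustering algorithm; which labeler, features, and clustering method the authors actually use is described in the paper itself, not here.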
Main file
Claveau_CICling14.pdf (546.08 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01027520, version 1 (22-07-2014)

Identifiers

  • HAL Id: hal-01027520, version 1

Cite

Vincent Claveau, Abir Ncibi. Knowledge discovery with CRF-based clustering of named entities without a priori classes. Conference on Intelligent Text Processing and Computational Linguistics CICLing, Apr 2014, Kathmandu, Nepal. pp.415-428. ⟨hal-01027520⟩
205 views
409 downloads
