Evaluation of unsupervised information extraction - HAL open archive
Conference paper, Year: 2012

Evaluation of unsupervised information extraction

W Wang
  • Role: Author
Romaric Besançon
Olivier Ferret

Abstract

Unsupervised methods are attracting growing attention in information extraction because they make it possible to design more open extraction systems. In unsupervised information extraction, clustering methods are of particular importance. However, evaluating clustering results at a large scale remains difficult, especially in the absence of a reliable reference. On the basis of our experiments on unsupervised relation extraction, we first discuss in this article how to evaluate clustering quality without a reference by relying on internal measures. We then propose a method, supported by a dedicated annotation tool, for building a set of reference clusters of relations from a corpus. We apply this method to our experimental framework and thereby illustrate how to build, in a short time, a significant reference for unsupervised relation extraction, made of 80 clusters gathering more than 4,000 relation instances. Finally, we show how such a reference is exploited for evaluating clustering with external measures, and we analyze the results of applying these measures to the clusters of relations produced by our unsupervised relation extraction system.
Main file
lrec-2012-uie-sent.pdf (455.52 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02282037, version 1 (09-09-2019)

Identifiers

  • HAL Id: hal-02282037, version 1

Cite

W Wang, Romaric Besançon, Olivier Ferret, Brigitte Grau. Evaluation of unsupervised information extraction. International Conference on Language Resources and Evaluation, Jan 2012, Istanbul, Turkey. ⟨hal-02282037⟩
