A Quantitative Evaluation of Global Word Sense Induction

Abstract: Word sense induction (WSI) is the task of automatically identifying the senses of words in texts, without the need for handcrafted resources or annotated data. Until now, most WSI algorithms have extracted the senses of a word 'locally', on a per-word basis, i.e. the senses of each word are determined separately. In this paper, we compare the performance of such algorithms to that of an algorithm that uses a 'global' approach, i.e. the senses of a particular word are determined by comparing them to, and demarcating them from, the senses of other words in a full-blown word space model. We adopt the evaluation framework proposed in the SemEval-2010 Word Sense Induction & Disambiguation task. All systems that participated in this task use a local scheme for determining the different senses of a word. We compare their results to those obtained by the global approach, and discuss the advantages and weaknesses of both approaches.
Document type:
Conference paper
CICLing'11 - 12th International Conference on Intelligent Text Processing and Computational Linguistics, Feb 2011, Tokyo, Japan. Springer, 6608, pp.253--264, 2011, 〈10.1007/978-3-642-19400-9_20〉

Cited literature: 25 references

https://hal.archives-ouvertes.fr/hal-00607673
Contributor: Marianna Apidianaki
Submitted on: Sunday, July 10, 2011 - 18:17:52
Last modified on: Friday, January 4, 2019 - 17:33:24
Document(s) archived on: Tuesday, October 11, 2011 - 02:21:02

File

camera-ready_Apidianaki_VandeC...
Files produced by the author(s)

Citation

Marianna Apidianaki, Tim Van de Cruys. A Quantitative Evaluation of Global Word Sense Induction. CICLing'11 - 12th International Conference on Intelligent Text Processing and Computational Linguistics, Feb 2011, Tokyo, Japan. Springer, 6608, pp.253--264, 2011, 〈10.1007/978-3-642-19400-9_20〉. 〈hal-00607673〉

Metrics

Record views

607

File downloads

131