Word Sense Induction with Attentive Context Clustering - HAL open archive
Journal article, Journal of Data Mining and Digital Humanities, Year: 2022

Word Sense Induction with Attentive Context Clustering

Moshe Stekel
  • Role: Author
  • PersonId: 1126667
Amos Azaria
  • Role: Author
Shai Gordin
  • Role: Author

Abstract

This paper presents ACCWSI (Attentive Context Clustering WSI), a method for Word Sense Induction, suitable for languages with limited resources. Pretrained on a small corpus and given an ambiguous word (a query word) and a set of excerpts that contain it, ACCWSI uses an attention mechanism for generating context-aware embeddings, distinguishing between the different senses assigned to the query word. These embeddings are then clustered to provide groups of main common uses of the query word. We show that ACCWSI performs well on the SemEval-2 2010 WSI task. ACCWSI also demonstrates practical applicability for shedding light on the meanings of ambiguous words in ancient languages, such as Classical Hebrew and Akkadian. In the near future, we intend to turn ACCWSI into a practical tool for linguists and historians.
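The pipeline described in the abstract (attention-weighted context embeddings, followed by clustering) can be illustrated with a toy sketch. This is not the authors' implementation: the 2-d word vectors, the dot-product attention, the greedy cosine clustering, the threshold of 0.8, and all function names below are assumptions made purely for illustration. The actual architecture is described in the paper.

```python
import math

# Toy 2-d word vectors, invented for this illustration (not learned, not from the paper).
VECTORS = {
    "bank": [0.5, 0.5],
    "river": [1.0, 0.0],
    "water": [0.9, 0.1],
    "shore": [0.8, 0.0],
    "money": [0.0, 1.0],
    "loan": [0.1, 0.9],
    "deposit": [0.0, 0.8],
}


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def context_embedding(query, excerpt):
    """Attention-weighted average of the context word vectors.

    Assumes the excerpt contains at least one known word besides the query.
    """
    context = [w for w in excerpt if w != query and w in VECTORS]
    # Attention scores: dot product between the query word and each context word.
    weights = softmax([dot(VECTORS[query], VECTORS[w]) for w in context])
    dim = len(VECTORS[query])
    return [
        sum(wt * VECTORS[w][i] for wt, w in zip(weights, context))
        for i in range(dim)
    ]


def cluster(embeddings, threshold=0.8):
    """Greedy clustering: join the first cluster whose seed is similar enough,
    otherwise start a new cluster with this embedding as its seed."""
    seeds, labels = [], []
    for emb in embeddings:
        for idx, seed in enumerate(seeds):
            if cosine(emb, seed) >= threshold:
                labels.append(idx)
                break
        else:
            seeds.append(emb)
            labels.append(len(seeds) - 1)
    return labels


def induce_senses(query, excerpts, threshold=0.8):
    """Return one cluster label per excerpt for the given query word."""
    return cluster([context_embedding(query, e) for e in excerpts], threshold)


EXCERPTS = [
    ["river", "bank", "water"],
    ["bank", "loan", "money"],
    ["shore", "bank", "river"],
    ["deposit", "money", "bank"],
]
labels = induce_senses("bank", EXCERPTS)
print(labels)
```

In this toy setting the river-related and money-related excerpts fall into two separate clusters, which mirrors the idea of grouping the main common uses of a query word.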

Main file
accwsi.pdf (691.4 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03586559, version 1 (24-02-2022)
hal-03586559, version 2 (03-03-2022)
hal-03586559, version 3 (14-04-2022)
hal-03586559, version 4 (06-06-2022)

Cite

Moshe Stekel, Amos Azaria, Shai Gordin. Word Sense Induction with Attentive Context Clustering. Journal of Data Mining and Digital Humanities, 2022, NLP4DH, ⟨10.46298/jdmdh.9175⟩. ⟨hal-03586559v4⟩
175 views
492 downloads
