Conference paper, Year: 2020

Boosting Tricks for Word Mover's Distance

Abstract

Word embeddings have opened a new path for addressing traditional problems in natural language processing (NLP). However, using word embeddings to compare text documents remains relatively unexplored, with Word Mover's Distance (WMD) being the most prominent tool so far. In this paper, we present a variety of tools that can further improve the computation of distances between documents based on WMD. We demonstrate that alternative stopwords, cross document-topic comparison, deep contextualized word vectors, and convex metric learning constitute powerful tools that can boost WMD.
Main file
boosting-WMD-ICANN-2020.pdf (279.2 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03088924, version 1 (27-12-2020)

Identifiers

  • HAL Id: hal-03088924, version 1

Cite

Konstantinos Skianis, Fragkiskos D. Malliaros, Nikolaos Tziortziotis, Michalis Vazirgiannis. Boosting Tricks for Word Mover's Distance. ICANN 2020 - 29th International Conference on Artificial Neural Networks, Sep 2020, Bratislava, Slovakia. ⟨hal-03088924⟩
