A Rank-Based Similarity Metric for Word Embeddings

Enrico Santus; Hongmin Wang; Emmanuele Chersoni; Yue Zhang

Conference paper, Year: 2018

Enrico Santus

Role: Author
PersonId : 999665

Massachusetts Institute of Technology

Singapore Institute of Technology [Singapore]

Hongmin Wang

Role: Author

Singapore Institute of Technology [Singapore]

University of California [Santa Barbara]

Emmanuele Chersoni

Role: Author
PersonId : 999663

Laboratoire Parole et Langage

Aix Marseille Université

Yue Zhang

Role: Author

Singapore Institute of Technology [Singapore]

Abstract

Word embeddings (WE) have recently established themselves as a standard for representing word meaning in NLP. Semantic similarity between word pairs has become the most common evaluation benchmark for these representations, with vector cosine typically used as the only similarity metric. In this paper, we report experiments with a rank-based metric for WE, which performs comparably to vector cosine in similarity estimation and outperforms it in the recently introduced and challenging task of outlier detection, thus suggesting that rank-based measures can improve clustering quality.
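
The abstract contrasts vector cosine with a rank-based measure but does not define the latter. As a rough illustration of the general idea only, the sketch below compares cosine with a Spearman-style correlation over dimension ranks. This is an assumption chosen for illustration, not the metric evaluated in the paper; the function names `cosine`, `dimension_ranks`, and `rank_similarity` are hypothetical.

```python
import math

def cosine(u, v):
    """Standard vector cosine between two embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def dimension_ranks(v):
    """Rank dimensions by value: the largest value gets rank 1.
    Ties are broken by dimension index (sorted() is stable)."""
    order = sorted(range(len(v)), key=lambda i: -v[i])
    ranks = [0] * len(v)
    for rank, dim in enumerate(order, start=1):
        ranks[dim] = rank
    return ranks

def rank_similarity(u, v):
    """Illustrative rank-based similarity (NOT the paper's metric):
    Pearson correlation of the two rank vectors, i.e. a Spearman-style
    score in [-1, 1]. Assumes both vectors have the same length >= 2."""
    ru, rv = dimension_ranks(u), dimension_ranks(v)
    mean = (len(ru) + 1) / 2  # mean of the ranks 1..n
    cov = sum((a - mean) * (b - mean) for a, b in zip(ru, rv))
    var = sum((a - mean) ** 2 for a in ru)  # identical for both rank vectors
    return cov / var

# Toy vectors that order their dimensions identically but differ in magnitude.
u = [0.9, 0.1, 0.5, -0.3]
v = [100.0, 0.2, 3.0, -1.0]
print(cosine(u, v), rank_similarity(u, v))
```

In this toy case the two vectors order their dimensions identically, so the rank score is maximal, while cosine is pulled below 1 by the single very large component. A rank-based measure ignores magnitudes and looks only at the ordering of dimensions.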

Domains

Computer Science [cs]; Computation and Language [cs.CL]

Main file

rank-based-similarity.pdf (210.47 KB)

Origin: Files produced by the author(s)

https://hal.science/hal-01838253

Submitted on: Friday, 13 July 2018, 11:09:06

Last modified on: Friday, 24 March 2023, 14:53:07

Long-term archiving on: Monday, 15 October 2018, 18:52:06

Dates and versions

hal-01838253, version 1 (13-07-2018)

Identifiers

HAL Id: hal-01838253, version 1

Cite

Enrico Santus, Hongmin Wang, Emmanuele Chersoni, Yue Zhang. A Rank-Based Similarity Metric for Word Embeddings. 56th annual meeting of the Association for Computational Linguistics (ACL), Jul 2018, Melbourne, Australia. ⟨hal-01838253⟩

Collections

CNRS UNIV-AMU LPL-AIX ILCB ANR
