On metric embedding for boosting semantic similarity computations - Archive ouverte HAL
Conference paper, 2015

On metric embedding for boosting semantic similarity computations

Julien Subercaze
Christophe Gravier
Frédérique Laforest

Abstract

Computing pairwise word semantic similarity is widely used and serves as a building block for many tasks in NLP. In this paper, we explore embedding the shortest-path metric of a knowledge base (WordNet) into the Hamming hypercube in order to improve computation performance. We show that, although an isometric embedding is intractable, good non-isometric embeddings are achievable. We report a speedup of three orders of magnitude for the task of computing Leacock and Chodorow (LCH) similarity, while preserving strong correlations with the exact values (r = .819, ρ = .826).
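To illustrate the idea behind the speedup (this is a hedged sketch, not the paper's embedding algorithm): LCH similarity requires a shortest-path query in the WordNet taxonomy, whereas once words are embedded as fixed-width bit signatures, similarity reduces to an XOR and a popcount. The signatures below are made up for illustration.

```python
import math

def lch_similarity(path_length, taxonomy_depth=20):
    """Leacock-Chodorow similarity: -log(len / (2 * D)),
    where D is the taxonomy depth (20 for WordNet nouns)."""
    return -math.log(path_length / (2.0 * taxonomy_depth))

def hamming(a, b):
    """Hamming distance between two integer bit signatures:
    a single XOR plus a popcount, i.e. a few CPU operations,
    versus a graph shortest-path query for the exact metric."""
    return bin(a ^ b).count("1")

# Suppose a (hypothetical) embedding assigned these 16-bit signatures.
sig_cat = 0b1011001110001101
sig_dog = 0b1011001010011101
d = hamming(sig_cat, sig_dog)  # -> 2
```

A non-isometric embedding means `d` only approximates the true shortest-path length, which is why the paper reports correlation (r, ρ) with exact LCH rather than equality.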
Main file: aclfinal.pdf (798.98 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01166163 , version 1 (22-06-2015)

Identifiers

  • HAL Id : hal-01166163 , version 1

Cite

Julien Subercaze, Christophe Gravier, Frédérique Laforest. On metric embedding for boosting semantic similarity computations. Association for Computational Linguistics, Jul 2015, Beijing, China. ⟨hal-01166163⟩
227 views
573 downloads
