Enhancing Translation Language Models with Word Embedding for Information Retrieval

In this paper, we explore the use of word embeddings as a semantic resource for the Information Retrieval (IR) task. These embeddings, produced by a shallow neural network, have been shown to capture semantic similarities between words (Mikolov et al., 2013). Our goal is therefore to enhance IR Language Models by addressing the term mismatch problem. To do so, we applied the model presented in "Integrating and Evaluating Neural Word Embeddings in Information Retrieval" by Zuccon et al. (2015), which estimates the translation probability of a Translation Language Model using the cosine similarity between word embeddings. The results obtained so far do not show a statistically significant improvement over a classical Language Model.
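The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy two-dimensional embeddings, the function names, the clipping of negative cosines to zero, and the choice to normalise over the vocabulary so that translation probabilities sum to one for each document word are all assumptions made for illustration. The sketch scores a query word w against a document d as the sum, over the document words u, of p_t(w | u) times the maximum-likelihood estimate p(u | d).

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two dense vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def translation_prob(w, u, emb):
    """p_t(w | u): cosine(w, u) normalised over every vocabulary word.

    Assumption: negative cosines are clipped to zero so the result is a
    valid probability distribution over w for a fixed u.
    """
    sims = {v: max(cosine(emb[v], emb[u]), 0.0) for v in emb}
    total = sum(sims.values())
    return sims[w] / total if total else 0.0


def doc_prob(w, doc_tokens, emb):
    """p(w | d) = sum over u in d of p_t(w | u) * p_ml(u | d)."""
    counts = Counter(doc_tokens)
    n = len(doc_tokens)
    return sum(
        translation_prob(w, u, emb) * count / n
        for u, count in counts.items()
        if u in emb  # words without an embedding are skipped (assumption)
    )


# Hypothetical toy embeddings, chosen by hand for the example.
toy_emb = {
    "car": [1.0, 0.0],
    "automobile": [0.9, 0.1],
    "banana": [0.0, 1.0],
}
toy_doc = ["automobile", "automobile", "banana"]

# "car" never occurs in the document, yet it receives non-zero probability
# because it is close to "automobile" in the embedding space.
p_car = doc_prob("car", toy_doc, toy_emb)
```

In this toy setting, a plain maximum-likelihood Language Model would assign "car" a probability of zero for this document, which is the term mismatch problem the abstract refers to. A practical system would also smooth p(w | d) with a collection model, which this sketch leaves out.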

Keywords

Word Embedding, Information Retrieval, Language Model

Domains

Information Retrieval [cs.IR]

Main file

Enhancing Translation Language Models with Word Embedding for Information Retrieval.pdf (186.87 KB)

Origin: Files produced by the author(s)

Contributor: Jibril FREJ

https://hal.science/hal-01681311

Submitted on: Thursday, January 11, 2018, 15:14:23

Last modified on: Thursday, April 4, 2024, 21:18:39

Long-term archiving on: Thursday, May 3, 2018, 16:05:35

Dates and versions

hal-01681311, version 1 (11-01-2018)

Identifiers

HAL Id: hal-01681311, version 1
arXiv: 1801.03844

Cite

Jibril Frej, Jean-Pierre Chevallet, Didier Schwab. Enhancing Translation Language Models with Word Embedding for Information Retrieval. 9ème Atelier Recherche d'Information SEmantique, Jul 2017, Caen, France. ⟨hal-01681311⟩


Collections

UGA CNRS LIG LIG_TDCGE_GETALP LIG_SIDCH
