Combining Subword information and Language model for Information Retrieval

Jibril Frej, Philippe Mulhem, Didier Schwab, Jean-Pierre Chevallet
Abstract: Information Retrieval (IR) classically relies on several processes to improve the performance of language modeling approaches. When considering the semantics of words, neural word embeddings (Mikolov et al., 2013) have been shown to capture semantic similarities between words. Such distributed representations, which represent terms in a dense vector space, are efficiently learned from large corpora. Recently, they have been used to compute the translation probabilities between terms in the Neural Translation Language Model (NTLM) framework (Zuccon et al., 2015) for Information Retrieval, in order to deal with the vocabulary mismatch issue. In this work, we propose to test this model with recent vector representations (Bojanowski et al., 2016) that take into account the internal structure of words.
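The core idea the abstract describes — turning embedding similarities into term-to-term translation probabilities, as in the NTLM framework of Zuccon et al. (2015) — can be sketched as follows. This is a minimal illustration with toy, hand-picked vectors standing in for learned subword-aware embeddings; the vocabulary and values are hypothetical, not taken from the paper.

```python
import numpy as np

# Toy embedding table standing in for learned (e.g. subword-aware) word
# vectors; the three terms and their coordinates are illustrative only.
emb = {
    "car":   np.array([0.9, 0.1, 0.0]),
    "auto":  np.array([0.8, 0.2, 0.1]),
    "fruit": np.array([0.0, 0.9, 0.4]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def translation_probs(term, vocab=emb):
    """p(t | term): cosine similarities of `term` to every vocabulary
    term, normalized to sum to 1 over the vocabulary (the NTLM-style
    construction of a translation distribution from embeddings)."""
    sims = {t: cosine(vocab[term], v) for t, v in vocab.items()}
    z = sum(sims.values())
    return {t: s / z for t, s in sims.items()}

probs = translation_probs("car")
# The semantically closer term "auto" receives a larger share of the
# translation mass than the unrelated term "fruit".
print(probs)
```

In a full IR system these translation probabilities would then be plugged into the query-likelihood language model, so that a document term can partially "translate" to a mismatched query term.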

Cited literature: 10 references

https://hal.archives-ouvertes.fr/hal-01781181
Contributor: Didier Schwab
Submitted on: Sunday, April 29, 2018 - 5:07:30 PM
Last modification on: Tuesday, February 12, 2019 - 1:30:55 AM
Archived on: Thursday, September 20, 2018 - 4:36:49 AM

File

CORIA2018_Frej-et-al.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01781181, version 1

Citation

Jibril Frej, Philippe Mulhem, Didier Schwab, Jean-Pierre Chevallet. Combining Subword information and Language model for Information Retrieval. 15e Conférence en Recherche d’Information et Applications, May 2018, Rennes, France. ⟨hal-01781181⟩
