Combining Subword information and Language model for Information Retrieval

Jibril Frej 1, 2, 3, 4 Philippe Mulhem 1, 2, 3 Didier Schwab 1, 2, 4 Jean-Pierre Chevallet 1, 2, 3
Abstract : Information Retrieval (IR) classically relies on several processes to improve the performance of language modeling approaches. When considering the semantics of words, Neural Word Embeddings (Mikolov et al., 2013) have been shown to capture semantic similarities between words. Such distributed representations place terms in a dense vector space and are efficiently learned from large corpora. Lately, they have been used to compute the translation probabilities between terms in the Neural Translation Language Model (NTLM) framework (Zuccon et al., 2015) for Information Retrieval, in order to deal with the vocabulary mismatch issue. In this work, we propose to test this model with recent vector representations (Bojanowski et al., 2016) that take into account the internal structure of words.
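To make the idea of embedding-based translation probabilities concrete, here is a minimal sketch and not the authors' implementation. It assumes that the translation probability of a term w given a term u is the cosine similarity of their vectors, normalized over the vocabulary, and that a document model is obtained by mixing these translation probabilities with maximum-likelihood term frequencies. The toy vectors, the clipping of negative similarities to zero, and the function names `translation_probs` and `translated_term_prob` are illustrative assumptions; in practice the vectors would come from a trained model, for instance one that uses subword information.

```python
import math


def cosine(u, v):
    """Cosine similarity between two dense vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def translation_probs(embeddings, source):
    """Return p_t(w | source) for every vocabulary term w.

    Assumption: probabilities are proportional to cosine similarity,
    with negative similarities clipped to zero before normalizing.
    """
    sims = {
        w: max(cosine(vec, embeddings[source]), 0.0)
        for w, vec in embeddings.items()
    }
    total = sum(sims.values())
    return {w: s / total for w, s in sims.items()}


def translated_term_prob(term, doc_tokens, embeddings):
    """p(term | doc) = sum over u in doc of p_t(term | u) * p_ml(u | doc).

    Assumes every document token has an embedding.
    """
    prob = 0.0
    for u in set(doc_tokens):
        p_u_given_doc = doc_tokens.count(u) / len(doc_tokens)
        prob += translation_probs(embeddings, u)[term] * p_u_given_doc
    return prob


# Hypothetical 3-dimensional vectors, for illustration only.
embeddings = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}
doc = ["automobile", "automobile", "banana"]

# "car" never occurs in the document, yet it receives probability mass
# through its similarity to "automobile".
p_car = translated_term_prob("car", doc, embeddings)
```

Under these assumptions, a query term that is absent from a document can still match it through a semantically close document term, which is how a translation model addresses vocabulary mismatch.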

Contributor : Didier Schwab
Submitted on : Sunday, April 29, 2018 - 5:07:30 PM
Last modification on : Tuesday, February 12, 2019 - 1:30:55 AM
Document(s) archived on : Thursday, September 20, 2018 - 4:36:49 AM


Files produced by the author(s)


  • HAL Id : hal-01781181, version 1


Jibril Frej, Philippe Mulhem, Didier Schwab, Jean-Pierre Chevallet. Combining Subword information and Language model for Information Retrieval. 15e Conférence en Recherche d’Information et Applications, May 2018, Rennes, France. ⟨hal-01781181⟩


