Enhancing Information Retrieval Through Concept-Based Language Modeling and Semantic Smoothing
Journal article. Journal of the Association for Information Science and Technology. Year: 2015

Abstract

Traditionally, many information retrieval models assume that terms occur in documents independently. Although these models have shown good performance, the word independence assumption is unrealistic from a natural language point of view, in which terms are related to each other. This assumption leads to two well-known problems in information retrieval (IR), namely polysemy and synonymy (term mismatch). In language models, these issues have been addressed by considering dependencies such as bigrams, phrasal concepts, or word relationships, but such models are estimated using simple n-gram or concept counting. In this paper, we address polysemy and synonymy mismatch with a concept-based language modeling approach that combines ontological concepts from external resources with frequently found collocations from the document collection. In addition, the concept-based model is enriched with subconcepts and semantic relationships through a semantic smoothing technique so as to perform semantic matching. Experiments carried out on TREC collections show that our model achieves significant improvements over a single word-based model and the Markov Random Field model (using a Markov classifier).
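
The idea of semantic smoothing mentioned in the abstract can be illustrated with a small sketch. This is not the authors' model: it assumes a query-likelihood scorer with Dirichlet smoothing, and it mixes the probability of a query concept with the probabilities of related concepts. The `RELATED` table, its weights, the function names, and the parameters `alpha` and `mu` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter

# Hypothetical relatedness weights between concepts, standing in for
# relationships that would come from an ontology. Values are illustrative
# and are not taken from the paper.
RELATED = {"car": {"automobile": 0.9, "vehicle": 0.5}}


def dirichlet_prob(unit, doc_counts, doc_len, coll_counts, coll_len, mu):
    """Dirichlet-smoothed probability of a unit (word or concept) in a document."""
    p_coll = coll_counts.get(unit, 0) / coll_len
    return (doc_counts.get(unit, 0) + mu * p_coll) / (doc_len + mu)


def smoothed_prob(unit, doc_counts, doc_len, coll_counts, coll_len, alpha, mu):
    """Interpolate the direct probability with that of related concepts."""
    direct = dirichlet_prob(unit, doc_counts, doc_len, coll_counts, coll_len, mu)
    related = RELATED.get(unit, {})
    if not related:
        return direct
    total = sum(related.values())  # normalise the relatedness weights
    semantic = sum(
        (w / total) * dirichlet_prob(r, doc_counts, doc_len, coll_counts, coll_len, mu)
        for r, w in related.items()
    )
    return (1 - alpha) * direct + alpha * semantic


def score(query_units, doc_units, coll_counts, coll_len, alpha=0.3, mu=2000.0):
    """Log query likelihood of a document under the smoothed model."""
    doc_counts = Counter(doc_units)
    doc_len = len(doc_units)
    total = 0.0
    for unit in query_units:
        p = smoothed_prob(unit, doc_counts, doc_len, coll_counts, coll_len, alpha, mu)
        if p <= 0.0:
            return float("-inf")  # unit unseen everywhere, no related evidence
        total += math.log(p)
    return total


# Toy data: neither document contains the query term "car".
d1 = ["automobile", "repair", "shop"]
d2 = ["bicycle", "repair", "shop"]
collection = Counter(d1 + d2 + ["car", "vehicle"])
coll_len = sum(collection.values())

# With semantic smoothing, the document mentioning "automobile" ranks higher.
print(score(["car"], d1, collection, coll_len) > score(["car"], d2, collection, coll_len))
```

With `alpha=0` the scorer reduces to a plain Dirichlet-smoothed unigram model, under which both toy documents receive the same score for the query "car". This is the synonymy mismatch that smoothing over related concepts is meant to reduce.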

Main file
saidlhadj_22015.pdf (795.86 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03516901, version 1 (07-01-2022)

Cite

Lynda Said Lhadj, Mohand Boughanem, Karima Amrouche. Enhancing Information Retrieval Through Concept-Based Language Modeling and Semantic Smoothing. Journal of the Association for Information Science and Technology, 2015, 67 (12), pp.2909-2927. ⟨10.1002/asi.23553⟩. ⟨hal-03516901⟩