Word sense discrimination in information retrieval: a spectral clustering-based approach

Abstract: Word sense ambiguity has been identified as a cause of poor precision in information retrieval (IR) systems. Word sense disambiguation and discrimination methods have been proposed to help systems choose which documents to retrieve for an ambiguous query. However, the approaches that show a genuine benefit from word sense discrimination or disambiguation in IR are generally supervised ones. In this paper we propose a new unsupervised method that uses word sense discrimination in IR. The method is based on spectral clustering and reorders an initially retrieved document list by boosting documents that are semantically similar to the target query. On several TREC ad hoc collections we show that our method is useful for queries that contain ambiguous terms. We focus on improving precision after 5, 10 and 30 retrieved documents (P@5, P@10 and P@30). We show that precision can be improved by 8% over current state-of-the-art baselines. We also pay particular attention to poorly performing queries.
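
The P@5, P@10 and P@30 measures named in the abstract are precision values computed over the top k documents of a ranked list. The following is a minimal sketch of that standard definition, not code from the paper; the function name `precision_at_k`, the document identifiers and the relevance judgments are all hypothetical and chosen only for illustration.

```python
from typing import Sequence, Set


def precision_at_k(ranked_doc_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k ranked documents that are judged relevant.

    The denominator is k itself, so a list shorter than k is treated
    as if the missing positions were non-relevant.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = ranked_doc_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Hypothetical ranked list for one query (best-ranked document first)
ranking = ["d3", "d7", "d1", "d9", "d4", "d2", "d8", "d5", "d6", "d0"]
# Hypothetical relevance judgments for the same query
relevant = {"d3", "d1", "d2"}

p5 = precision_at_k(ranking, relevant, 5)    # d3 and d1 are in the top 5
p10 = precision_at_k(ranking, relevant, 10)  # d3, d1 and d2 are in the top 10
print(f"P@5 = {p5:.2f}, P@10 = {p10:.2f}")
```

Under this definition, a reordering step that moves relevant documents toward the top of the list raises P@k for small k even when the set of retrieved documents is unchanged.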

https://hal.archives-ouvertes.fr/hal-01153775
Contributor: Open Archive Toulouse Archive Ouverte (oatao)
Submitted on: Wednesday, May 20, 2015 - 2:40:24 PM
Last modification on: Friday, October 11, 2019 - 8:22:49 PM
Long-term archiving on: Tuesday, September 15, 2015 - 6:28:50 AM

Citation

Adrian-Gabriel Chifu, Florentina Hristea, Josiane Mothe, Marius Popescu. Word sense discrimination in information retrieval: a spectral clustering-based approach. Information Processing and Management, Elsevier, 2014, vol. 51 (n° 2), pp. 16-31. ⟨10.1016/j.ipm.2014.10.007⟩. ⟨hal-01153775⟩
