Conference papers

Is document frequency important for PRF?

Abstract : We introduce a new heuristic constraint for pseudo-relevance feedback (PRF) models, the Document Frequency (DF) constraint, and validate it through a series of experiments with an oracle. We then analyze state-of-the-art PRF models theoretically with respect to this constraint. The analysis reveals that the standard mixture model for PRF in the language modeling family does not satisfy the DF constraint, unlike several recently proposed models. Lastly, we run tests with a simple family of tf-idf functions in which a parameter controls whether the DF constraint is satisfied; these tests further validate the constraint.
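The abstract does not state the DF constraint or the tf-idf family formally, so the sketch below is only an illustration under stated assumptions. It assumes the constraint means, roughly, that among terms with the same total frequency in the feedback set, a term occurring in more feedback documents should receive a higher weight. The function `feedback_weight`, the exponent `k`, and the toy counts are hypothetical and are not taken from the paper.

```python
def feedback_weight(tfs, idf, k):
    """Hypothetical feedback weight: idf times the sum of tf**k
    over the feedback documents (assumed form, not the paper's)."""
    return idf * sum(tf ** k for tf in tfs)

# Two hypothetical terms with the same idf and the same total
# frequency (6) across three feedback documents.
concentrated = [6, 0, 0]  # all occurrences in one document
spread = [2, 2, 2]        # occurrences in every document
idf = 1.0

for k in (0.5, 1.0, 2.0):
    w_conc = feedback_weight(concentrated, idf, k)
    w_spread = feedback_weight(spread, idf, k)
    # k < 1 (concave) favors the term with higher document frequency,
    # k = 1 is indifferent, k > 1 (convex) favors the concentrated term.
    print(f"k={k}: concentrated={w_conc:.3f}, spread={w_spread:.3f}")
```

In this toy setting a single exponent decides whether the term found in more feedback documents is favored, which is one simple way a parameter could control satisfaction of such a constraint.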

https://hal.archives-ouvertes.fr/hal-00742242
Contributor : Eric Gaussier
Submitted on : Tuesday, October 16, 2012 - 11:46:41 AM
Last modification on : Monday, April 20, 2020 - 11:24:01 AM
Archived on : Thursday, January 17, 2013 - 11:35:16 AM

File

Clinchant-ictir2011.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00742242, version 1

Citation

Stéphane Clinchant, Éric Gaussier. Is document frequency important for PRF? ICTIR 2011 - International Conference on the Theory of Information Retrieval, Sep 2011, Bertinoro, Italy. pp.89-100. ⟨hal-00742242⟩
