Conference papers

Indiscriminateness in representation spaces of terms and documents

Vincent Claveau 1
1 LinkMedia - Creating and exploiting explicit links between multimedia fragments
Inria Rennes – Bretagne Atlantique , IRISA-D6 - MEDIA ET INTERACTIONS
Abstract : Examining the properties of representation spaces for documents or words in Information Retrieval (IR), typically R^n with n large, yields valuable insights that can help the retrieval process. Recently, several authors have studied the real dimensionality of the datasets, called intrinsic dimensionality, in specific parts of these spaces [14]. They have shown that this dimensionality is chiefly tied to the notion of indiscriminateness among neighbors of a query point in the vector space. In this paper, we propose to revisit this notion in the specific case of IR. More precisely, we show how to estimate indiscriminateness from IR similarities so that it can be applied to the representation spaces used for documents and words [18, 7]. We show that indiscriminateness may be used to characterize difficult queries; moreover, we show that this notion, applied to word embeddings, can help choose terms for query expansion.

Contributor : Vincent Claveau
Submitted on : Wednesday, August 22, 2018 - 11:58:18 AM
Last modification on : Friday, April 8, 2022 - 4:08:03 PM
Long-term archiving on : Friday, November 23, 2018 - 2:46:34 PM





Vincent Claveau. Indiscriminateness in representation spaces of terms and documents. ECIR 2018 - 40th European Conference on Information Retrieval, Mar 2018, Grenoble, France. pp.251-262, ⟨10.1007/978-3-319-76941-7_19⟩. ⟨hal-01859568⟩


