
A weakly-supervised detection of entity central documents in a stream

Abstract : Filtering a time-ordered corpus for documents that are highly relevant to an entity is a task that has received growing attention in recent years. One application is to reduce the delay between the moment a piece of information about an entity is first observed and the moment the entity's entry in a knowledge base is updated. Current state-of-the-art approaches are highly supervised and require training examples for each monitored entity. We propose an approach that does not require new training data when processing a new entity. To capture intrinsic characteristics of highly relevant documents, our approach relies on three types of features: document-centric features, entity-profile-related features, and time features. Evaluated within the framework of the "Knowledge Base Acceleration" track at TREC 2012, it outperforms current state-of-the-art approaches.
Document type :
Conference papers
Contributor : Bibliothèque Universitaire Déposants Hal-Avignon
Submitted on : Monday, May 9, 2016 - 2:24:11 PM
Last modification on : Monday, March 30, 2020 - 8:41:19 AM



Ludovic Bonnefoy, Vincent Bouvier, Patrice Bellot. A weakly-supervised detection of entity central documents in a stream. SIGIR 2013, Jul 2013, Dublin, Ireland. ⟨10.1145/2484028.2484180⟩. ⟨hal-01313021⟩


