Best-Effort Refresh Strategies for Content-Based RSS Feed Aggregation

Roxana Horincar 1, Bernd Amann 1, Thierry Artières 2
1 BD - Bases de Données
LIP6 - Laboratoire d'Informatique de Paris 6
2 MALIRE - Machine Learning and Information Retrieval
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: Over the past several years, RSS-based content syndication has become a standard technique for disseminating information on the web efficiently and in a timely manner. From a data processing perspective, RSS feeds are standard XML resources that feed aggregators refresh periodically to generate continuous streams of items. In this article, we study the problem of information loss in a content-based feed aggregation system and propose a new best-effort refresh strategy for RSS feeds under limited bandwidth. We evaluate this strategy experimentally and compare it with other state-of-the-art crawling strategies for web pages.
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01292109
Contributor : Lip6 Publications
Submitted on : Tuesday, March 22, 2016 - 3:32:53 PM
Last modification on : Friday, March 22, 2019 - 1:39:04 AM

Citation

Roxana Horincar, Bernd Amann, Thierry Artières. Best-Effort Refresh Strategies for Content-Based RSS Feed Aggregation. The 11th international conference on Web information systems engineering (WISE 2010), Dec 2010, Hong Kong, China. pp.262-270, ⟨10.1007/978-3-642-17616-6_24⟩. ⟨hal-01292109⟩
