Semantic-aware news feeds management framework

Abstract : On the Web, RSS and Atom (feeds) are probably the most popular and widely used XML formats; they allow web communities, the publishing industry, web services, etc. to publish and exchange XML documents. They also let a user consume information easily through software applications, without roaming from site to site. The user registers her favorite feed providers, and each provider sends the list of news items changed since the last download. However, registering many feed sources in a feed aggregator causes both heterogeneity and information overload problems. Moreover, none of the existing RSS/feed aggregators provides an approach that integrates (merges) feeds from different sources while considering similarity, user contexts and preferences. In this research, we provide a formal framework that handles the heterogeneity, integration and querying of feeds. The framework is based on a tree representation of a feed and has three main components: a feed comparator, a merger and a query processor. The feed comparator addresses the issue of measuring the relatedness between news items using a Knowledge Base and a bottom-up, incremental approach. We propose a concept-based similarity measure defined as a function of the number of shared and different concepts in the global semantic neighborhoods of the two concepts being compared. We use the concept similarity value and relationship as building blocks for the text, simple element and item relatedness algorithms. We also show how to define and identify the exclusive relationship between any two texts or elements. The feed merger addresses the issue of integrating news items from different sources considering a user context. We show how to represent a user context and her preferences, and we provide a predefined set of merging rules that can be extended and adapted by a user. The query processor is based on a formal study of an RSS query algebra that uses the notion of semantic similarity over dynamic content. The operators are supported by a set of similarity-based helper functions. We categorize the RSS operators into extraction, set membership and merge operators. The merge operator generalizes the join and the set membership operators. We also provide a set of query rewriting and equivalence rules to be used during query simplification and optimization. Finally, we present a desktop prototype called Easy RSS Manager (EasyRSSManager), which has a semantic-aware RSS reader and semantic-aware, window-based RSS query components. It is designed to validate, demonstrate and test the practicability of the different proposals of this research. In particular, we test the time complexity and the relevance of our approaches using both a real and a synthetic dataset.
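
To make the idea of a concept-based similarity measure more concrete, the following is a minimal Python sketch and not the formula used in the thesis, which this abstract does not state. It assumes a Jaccard-style ratio (shared concepts divided by shared plus different concepts) over neighborhoods collected from a toy knowledge base. The names `KB`, `neighborhood`, `concept_similarity` and the `radius` parameter are all hypothetical and introduced only for illustration.

```python
from typing import Dict, Set

# Hypothetical toy knowledge base: each concept maps to its directly related concepts.
KB: Dict[str, Set[str]] = {
    "car": {"vehicle"},
    "truck": {"vehicle"},
    "vehicle": {"car", "truck"},
    "banana": {"fruit"},
    "fruit": {"banana"},
}


def neighborhood(concept: str, kb: Dict[str, Set[str]], radius: int = 1) -> Set[str]:
    """Collect the concept itself plus every concept reachable within `radius` hops."""
    seen = {concept}
    frontier = {concept}
    for _ in range(radius):
        # Expand one hop and keep only concepts not visited yet.
        frontier = {n for c in frontier for n in kb.get(c, set())} - seen
        seen |= frontier
    return seen


def concept_similarity(a: str, b: str, kb: Dict[str, Set[str]], radius: int = 1) -> float:
    """Assumed Jaccard-style score: shared / (shared + different) neighborhood concepts."""
    na = neighborhood(a, kb, radius)
    nb = neighborhood(b, kb, radius)
    shared = len(na & nb)
    different = len(na ^ nb)  # concepts found in exactly one neighborhood
    return shared / (shared + different)


if __name__ == "__main__":
    for pair in [("car", "car"), ("car", "truck"), ("car", "banana")]:
        print(pair, concept_similarity(*pair, KB))
```

Under these assumptions, identical concepts score 1.0, concepts with disjoint neighborhoods score 0.0, and partially overlapping neighborhoods fall in between. A score of this kind could then serve as the building block for comparing texts, elements and items, as the abstract describes.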
Document type :
Theses

Cited literature : 164 references

https://tel.archives-ouvertes.fr/tel-00589911
Contributor : Abes Star
Submitted on : Monday, May 2, 2011 - 4:54:07 PM
Last modification on : Wednesday, September 12, 2018 - 1:27:18 AM
Document(s) archived on : Wednesday, August 3, 2011 - 2:49:34 AM

File

these_A_TADDESSE_Fekade_Getahu...
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00589911, version 1

Citation

Fekade Getahun Taddesse. Semantic-aware news feeds management framework. Other [cs.OH]. Université de Bourgogne, 2010. English. ⟨NNT : 2010DIJOS036⟩. ⟨tel-00589911⟩

Metrics

Record views : 918

File downloads : 1029