Characterizing Web Syndication Behavior and Content

Abstract : We are witnessing the widespread adoption of web syndication technologies such as RSS and Atom for the timely delivery of frequently updated Web content. Almost every personal weblog, news portal, or discussion forum now employs RSS/Atom feeds to complement pull-oriented searching and browsing of web pages with push-oriented protocols for delivering web content. Social media applications such as Twitter or Facebook also employ RSS to notify users about newly available posts from their preferred friends. Unfortunately, previous work on the statistical characteristics of RSS/Atom feeds does not provide a precise and up-to-date characterization of feeds' behavior and content, a characterization that could be used to benchmark the effectiveness and efficiency of various RSS processing and analysis techniques. In this paper, we present the first thorough analysis of three complementary features of real-scale RSS feeds, namely publication activity, item structure and length, and the vocabulary of their content, which we believe are crucial for Web 2.0 applications.
Document type : Conference paper

https://hal.archives-ouvertes.fr/hal-00737239
Contributor : Zeinab Hmedeh
Submitted on : Monday, October 1, 2012 - 2:07:57 PM
Last modification on : Saturday, February 8, 2020 - 8:32:39 PM
Long-term archiving on: Wednesday, January 2, 2013 - 6:30:11 AM

File

art_2163.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-00737239, version 1

Citation

Zeinab Hmedeh, Nelly Vouzoukidou, Nicolas Travers, Vassilis Christophides, Cédric Du Mouza, et al. Characterizing Web Syndication Behavior and Content. WISE'11, The 12th International Conference on Web Information Systems Engineering, Oct 2011, Sydney, Australia. pp.29-42. ⟨hal-00737239⟩
