Conference papers

An End-User Pipeline for Scraping and Visualizing Semi-Structured Data over the Web

Abstract : The Web is a vast source of semi-structured datasets that are made readily available to support the construction of new knowledge. Information visualization techniques have been demonstrated as a suitable alternative for allowing users to analyze and understand large amounts of data. However, the steps required for visualizing semi-structured data obtained from the Web are not straightforward, and the data require proper treatment before information visualization techniques can be applied. In this work, we present a visualization pipeline that describes the fundamental operations required for visualizing semi-structured data over the Web. We employ Web Scraping and Web Augmentation techniques to support interactive visualizations and to solve tasks without changing the context of use of the data. Our approach is supported by a framework that includes scraping, augmentation, and visualization tools, and it has been applied to different kinds of websites to demonstrate its validity and feasibility. Our ultimate goal is to expand the limits of our technology to improve user interaction with websites and to create new experiences for a better understanding of large datasets.
Contributor : Elöd Egyed-Zsigmond
Submitted on : Wednesday, July 10, 2019 - 3:31:11 PM
Last modification on : Sunday, June 26, 2022 - 2:39:39 AM


  • HAL Id : hal-02179226, version 1


Gabriela Bosseti, Sergio Firmenich, Marco Winckler, Gustavo Rossi, Ulises Fandos, et al. An End-User Pipeline for Scraping and Visualizing Semi-Structured Data over the Web. International Conference on Web Engineering, Jun 2019, Daejeon, South Korea. pp. 223-237. ⟨hal-02179226⟩


