HeatPipe: High Throughput, Low Latency Big Data Heatmap with Spark Streaming

Heatmaps are a well-known visualization technique that alleviates the overplotting problem of point-based visualization, which makes them well suited to visualizing Big Data. Tackling the velocity problem of Big Data requires streaming computation. Canopy clustering was recently shown to be well suited to Big Data heatmap visualization. In this article, we present the design of a streaming algorithm that computes canopy clustering using Apache Spark. The result can be integrated directly into a lambda architecture.
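To make the idea of incremental canopy clustering concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm and does not use Spark. It assumes a simplified single-threshold variant in which each point joins the first canopy whose center lies within `radius`, and otherwise opens a new canopy. The names `add_point`, `process_batch`, `merge`, and `radius` are hypothetical and chosen for illustration only.

```python
from math import hypot


def add_point(canopies, point, radius, weight=1):
    """Add a weighted 2D point to the first canopy within `radius`,
    or open a new canopy centered on the point."""
    x, y = point
    for canopy in canopies:
        cx, cy = canopy["center"]
        if hypot(x - cx, y - cy) <= radius:
            canopy["weight"] += weight
            return canopies
    canopies.append({"center": (x, y), "weight": weight})
    return canopies


def process_batch(canopies, batch, radius):
    """Update the running canopy state in place with one micro-batch of points."""
    for point in batch:
        add_point(canopies, point, radius)
    return canopies


def merge(left, right, radius):
    """Combine two canopy sets by re-inserting the canopies of `right`
    into a copy of `left` as weighted points (assumed merge rule)."""
    merged = [dict(c) for c in left]
    for canopy in right:
        add_point(merged, canopy["center"], radius, canopy["weight"])
    return merged


# Two micro-batches arriving one after the other.
state = []
for batch in [
    [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0)],
    [(0.2, 0.0), (5.1, 5.0), (9.0, 9.0)],
]:
    process_batch(state, batch, radius=1.0)

for canopy in state:
    print(canopy["center"], canopy["weight"])
```

Each canopy weight can then serve as the intensity of one heatmap cell. Because the state is a small list of weighted centers, it can be updated one micro-batch at a time, and `merge` suggests how partial results from separate partitions might be combined under the same assumption.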

Keywords

Big Data, Information Visualization, Heatmap, Lambda Architecture

Domains

Distributed, Parallel, and Cluster Computing [cs.DC]

Main file

iv2017_heatmapStreaming(4).pdf (729.99 KB)

Origin: Files produced by the author(s)

Contributor: Alexandre Perrot

https://hal.science/hal-01516888

Submitted on: Tuesday, July 18, 2017, 16:21:50

Last modified on: Friday, March 24, 2023, 14:53:04

Long-term archiving on: Saturday, January 27, 2018, 07:40:17

Dates and versions

hal-01516888, version 1 (18-07-2017)

Identifiers

HAL Id: hal-01516888, version 1

Cite

Alexandre Perrot, Romain Bourqui, Nicolas Hanusse, David Auber. HeatPipe: High Throughput, Low Latency Big Data Heatmap with Spark Streaming. IV2017 - 21st International Conference on Information Visualisation, Jul 2017, London, United Kingdom. ⟨hal-01516888⟩

Collections

CNRS LABRI-MABIOVIS
