Report, Year: 2014

Efficiently Summarizing Distributed Data Streams over Sliding Windows

Abstract

Estimating the frequency of any piece of information in large-scale distributed data streams has become of utmost importance in the last decade (e.g., in the context of network monitoring, big data, etc.). Although some elegant solutions have been proposed recently, they compute their approximation over the whole stream since its inception. In a runtime distributed context, one would prefer to gather information only about the recent past. In this paper, we consider the sliding window functional monitoring model and propose two different (on-line) algorithms that $(\varepsilon,\delta)$-approximate the item frequencies in the active window. They use a very small amount of memory with respect to the size of the window $N$ and the number of distinct items $n$ of the stream: namely $O(\frac{1}{\varepsilon} \log \frac{1}{\delta} (\log N + \log n))$ and $O(\frac{1}{\tau\varepsilon} \log \frac{1}{\delta} (\log N + \log n))$ bits of space, where $\tau$ is a parameter limiting memory usage. We also provide their distributed variants, with a communication cost of $O(\frac{k}{\varepsilon^2} \log \frac{1}{\delta} \log N)$ bits per window (where $k$ is the number of nodes). Experiments on synthetic traces and real data sets validate the robustness and accuracy of our algorithms.
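
To make the problem statement concrete, here is a minimal Python sketch of what estimating item frequencies "in the active window" means. It is not one of the two algorithms of the report: it is a plain Count-Min sketch combined with an exact buffer of the last $N$ items, so its memory is linear in $N$, which is precisely the cost the report's algorithms are designed to avoid. The class name `WindowedCountMin`, the hash construction, and the sizing of the table (about $e/\varepsilon$ columns and $\ln(1/\delta)$ rows, the usual Count-Min choice) are assumptions made for illustration only.

```python
import math
import random
from collections import deque


class WindowedCountMin:
    """Naive baseline: Count-Min sketch over the last `window` items.

    Hypothetical illustration only. The exact buffer used for expiry costs
    O(window) memory, unlike the sublinear algorithms of the report.
    """

    def __init__(self, epsilon, delta, window, seed=0):
        # Standard Count-Min sizing (assumed here, not taken from the report).
        self.width = math.ceil(math.e / epsilon)
        self.depth = max(1, math.ceil(math.log(1 / delta)))
        self.window = window
        self.rows = [[0] * self.width for _ in range(self.depth)]
        # One (a, b) pair per row for a hash of the form ((a*x + b) mod p) mod width.
        self.prime = (1 << 61) - 1
        rng = random.Random(seed)
        self.coeffs = [
            (rng.randrange(1, self.prime), rng.randrange(self.prime))
            for _ in range(self.depth)
        ]
        self.buffer = deque()  # exact copy of the active window

    def _cells(self, item):
        x = hash(item) % self.prime
        for row, (a, b) in enumerate(self.coeffs):
            yield row, ((a * x + b) % self.prime) % self.width

    def _add(self, item, amount):
        for row, col in self._cells(item):
            self.rows[row][col] += amount

    def update(self, item):
        """Insert a new item and expire the oldest one once the window is full."""
        self.buffer.append(item)
        self._add(item, 1)
        if len(self.buffer) > self.window:
            self._add(self.buffer.popleft(), -1)

    def estimate(self, item):
        """Return an upper bound on the item's frequency in the active window."""
        return min(self.rows[row][col] for row, col in self._cells(item))
```

Because expired items are subtracted exactly, every counter only aggregates items that are still in the window, so `estimate` never falls below the true windowed frequency; collisions can only inflate it. Replacing the exact buffer with a compact summary, while keeping an $(\varepsilon,\delta)$ guarantee, is the difficulty the report addresses.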
Main file
tr1014.pdf (665.11 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-01073877, version 1 (10-10-2014)
hal-01073877, version 2 (15-06-2015)
hal-01073877, version 3 (30-06-2015)

Identifiers

  • HAL Id: hal-01073877, version 1

Cite

Nicolò Rivetti, Yann Busnel, Achour Mostefaoui. Efficiently Summarizing Distributed Data Streams over Sliding Windows. 2014. ⟨hal-01073877v1⟩
722 views
698 downloads
