On the Fly Detection of the Top-k Items in the Distributed Sliding Window Model

Abstract: This paper presents a new algorithm that detects, on the fly, the k most frequent items in the sliding window model. The algorithm is distributed among the nodes of the system. It is inspired by a recent and innovative approach that, instead of trying to estimate an item's number of occurrences, associates with the item a stochastic value correlated with its frequency. This stochastic value is the number of consecutive heads obtained in a coin-flipping process before the first tail occurs. The original approach retained only the maximum number of consecutive heads obtained by an item, since an item that occurs often has a higher probability of reaching a high value. While this is effective for very skewed data distributions, the correlation is not tight enough to robustly distinguish items with comparable frequencies. To address this issue, we propose to combine the stochastic approach with a deterministic counting of items. Specifically, instead of keeping the maximum number of consecutive heads obtained by an item, we count the number of times the coin-flipping process of an item has exceeded a given threshold. This threshold is defined by combining theoretical results on the leader election and coupon collector problems. Results on simulated data show that the top-k items are detected effectively across a wide range of distributions.
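
The counting idea described in the abstract can be illustrated with a minimal, single-node sketch. It is not the authors' algorithm: the distributed aspect is left out, the sliding window is approximated by the last `window` stream positions, and `threshold` is a hypothetical parameter supplied by the caller, whereas the paper derives its threshold from leader election and coupon collector results. The function names are likewise assumptions made for illustration.

```python
import random
from collections import Counter, deque


def consecutive_heads(rng):
    """Count fair-coin heads obtained before the first tail."""
    heads = 0
    while rng.random() < 0.5:
        heads += 1
    return heads


def top_k_by_exceedances(stream, k, threshold, window, seed=0):
    """Return up to k items ranked by how often their coin-flipping run
    exceeded `threshold` within the last `window` positions of the stream.

    Illustrative only: `threshold` and `window` are hypothetical inputs,
    not the values defined in the paper.
    """
    rng = random.Random(seed)
    counts = Counter()
    events = deque()  # (position, item) for each occurrence that exceeded the threshold
    for pos, item in enumerate(stream):
        if consecutive_heads(rng) > threshold:
            events.append((pos, item))
            counts[item] += 1
        # Expire exceedance events that have left the sliding window.
        while events and events[0][0] <= pos - window:
            _, old_item = events.popleft()
            counts[old_item] -= 1
            if counts[old_item] == 0:
                del counts[old_item]
    return [item for item, _ in counts.most_common(k)]
```

Calling `top_k_by_exceedances(stream, k=2, threshold=2, window=len(stream))` on a shuffled stream ranks items by their exceedance counts. With a fair coin, an occurrence exceeds a threshold of 2 with probability 1/8, so an item's count grows in proportion to its frequency, which is what lets items with comparable frequencies be separated more reliably than by a single maximum.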
Document type:
Conference paper
NCA 2018 - 17th IEEE International Symposium on Network Computing and Applications, Nov 2018, Boston, United States. IEEE, pp.1-8, 〈10.1109/NCA.2018.8548097〉
https://hal.archives-ouvertes.fr/hal-01888298
Contributor: Emmanuelle Anceaume
Submitted on: Monday, October 8, 2018 - 09:49:04
Last modified on: Tuesday, February 12, 2019 - 13:50:19
Document(s) archived on: Wednesday, January 9, 2019 - 13:06:08

File

NCA_2018.pdf
Files produced by the author(s)

Citation

Emmanuelle Anceaume, Yann Busnel, Vasile Cazacu. On the Fly Detection of the Top-k Items in the Distributed Sliding Window Model. NCA 2018 - 17th IEEE International Symposium on Network Computing and Applications, Nov 2018, Boston, United States. IEEE, pp.1-8, 〈10.1109/NCA.2018.8548097〉. 〈hal-01888298〉
