Efficient key grouping for near-optimal load balancing in stream processing systems

Abstract : Key grouping is a technique used by stream processing frameworks to simplify the development of parallel stateful operators. Through key grouping a stream of tuples is partitioned in several disjoint sub-streams depending on the values contained in the tuples themselves. Each operator instance target of one sub-stream is guaranteed to receive all the tuples containing a specific key value. A common solution to implement key grouping is through hash functions that, however, are known to cause load imbalances on the target operator instances when the input data stream is characterized by a skewed value distribution. In this paper we present DKG, a novel approach to key grouping that provides near-optimal load distribution for input streams with skewed value distribution. DKG starts from the simple observation that with such inputs the load balance is strongly driven by the most frequent values; it identifies such values and explicitly maps them to sub-streams together with groups of less frequent items to achieve a near-optimal load balance. We provide theoretical approximation bounds for the quality of the mapping derived by DKG and show, through both simulations and a running prototype, its impact on stream processing applications.
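
The contrast drawn in the abstract can be illustrated with a minimal Python sketch. It is not the DKG algorithm from the paper: the function names, the frequency threshold, the number of hash buckets per instance, and the greedy least-loaded placement are all assumptions made here for illustration. It also counts exact frequencies over a finite list, which a real stream processor could not do. The baseline sends each key to the instance given by a hash of the key; the frequency-aware variant places frequent keys individually, together with hash-based groups of infrequent keys, on the currently least-loaded instance.

```python
from collections import Counter
import zlib


def stable_hash(key, modulus):
    """Deterministic hash; Python's built-in hash() is salted per process for str."""
    return zlib.crc32(str(key).encode("utf-8")) % modulus


def hash_grouping(stream, k):
    """Baseline key grouping: every key goes to instance hash(key) mod k."""
    return {key: stable_hash(key, k) for key in set(stream)}


def frequency_aware_grouping(stream, k, threshold=0.05, buckets_per_instance=4):
    """Illustrative heuristic (an assumption, not the paper's algorithm).

    Keys whose relative frequency is at least `threshold` are scheduled one by
    one; the remaining keys are first grouped into hash buckets. Items are then
    placed, heaviest first, on the least-loaded instance.
    """
    freq = Counter(stream)
    n = len(stream)
    n_buckets = k * buckets_per_instance

    items = {}  # item id -> (weight, keys belonging to the item)
    for key, count in freq.items():
        if count / n >= threshold:
            item_id = ("heavy", key)
        else:
            item_id = ("bucket", stable_hash(key, n_buckets))
        weight, members = items.get(item_id, (0, []))
        items[item_id] = (weight + count, members + [key])

    loads = [0] * k
    mapping = {}
    for weight, members in sorted(items.values(), key=lambda item: -item[0]):
        target = loads.index(min(loads))  # least-loaded instance so far
        loads[target] += weight
        for key in members:
            mapping[key] = target
    return mapping


def instance_loads(stream, mapping, k):
    """Number of tuples each of the k instances receives under `mapping`."""
    loads = [0] * k
    for key in stream:
        loads[mapping[key]] += 1
    return loads


if __name__ == "__main__":
    # Skewed toy stream: key i appears 1000 // i times.
    stream = [i for i in range(1, 51) for _ in range(1000 // i)]
    k = 4
    print("hash:", instance_loads(stream, hash_grouping(stream, k), k))
    print("frequency-aware:",
          instance_loads(stream, frequency_aware_grouping(stream, k), k))
```

Both functions return a key-to-instance dictionary, so every tuple with a given key reaches the same instance, which is the guarantee key grouping must preserve. Only the way the dictionary is built differs.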

https://hal.archives-ouvertes.fr/hal-01303887
Contributor : Yann Busnel
Submitted on : Tuesday, April 19, 2016 - 6:27:31 PM
Last modification on : Thursday, February 7, 2019 - 2:35:14 PM
Long-term archiving on : Tuesday, November 15, 2016 - 5:52:56 AM

File

rqabs-dkg-algotel15.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01303887, version 1

Citation

Nicoló Rivetti, Leonardo Querzoni, Emmanuelle Anceaume, Yann Busnel, Bruno Sericola. Groupement de clés efficace pour un équilibrage de charge quasi-optimal dans les systèmes de traitement de flux. ALGOTEL 2016 - 18èmes Rencontres Francophones sur les Aspects Algorithmiques des Télécommunications, May 2016, Bayonne, France. ⟨hal-01303887⟩
