Conference papers

On the Cost of Acking in Data Stream Processing Systems

Abstract : The widespread use of social networks and applications such as IoT networks generates a continuous stream of data that companies and researchers want to process, ideally in real time. Data stream processing (DSP) systems enable such continuous data analysis by implementing the set of operations to be performed on the stream as a directed acyclic graph (DAG) of tasks. While these DSP systems embed mechanisms to ensure fault tolerance and message reliability, only a few studies focus on the impact of these mechanisms on the performance of applications at runtime. In this paper, we demonstrate the impact of the message reliability mechanism on application performance. We take an experimental approach, based on the Storm middleware, to study an acknowledgment-based framework. We compare the two standard schedulers available in Storm on applications with various degrees of parallelism, in single- and multi-cluster scenarios. We show that the acking layer may create an unforeseen bottleneck due to the placement of acking tasks; a problem which, to the best of our knowledge, has been overlooked in the scientific and technical literature. We propose two strategies for improving the placement of acking tasks and demonstrate their benefit in terms of throughput and latency.

Cited literature [32 references]
Contributor : Alessio Pagliari
Submitted on : Monday, May 20, 2019 - 4:33:39 PM
Last modification on : Sunday, June 26, 2022 - 2:38:33 AM


Files produced by the author(s)




Alessio Pagliari, Fabrice Huet, Guillaume Urvoy-Keller. On the Cost of Acking in Data Stream Processing Systems. 2019 19th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (CCGRID), May 2019, Larnaca, Cyprus. ⟨10.1109/CCGRID.2019.00047⟩. ⟨hal-02134654⟩


