Decomposition and Sharing User-defined Aggregation: from Theory to Practice

Abstract : We study the problems of decomposing and sharing user-defined aggregate functions in distributed and parallel computing. An aggregation usually needs to satisfy the distributive property to be computed in parallel and to benefit from optimizations in multidimensional data analysis and in conjunctive queries with aggregation. However, this property is too restrictive, and it prevents many aggregations from benefiting from these advantages. We propose a formal framework for user-defined aggregation functions that relaxes this condition, and we map this framework to MRC, an efficient computation model for MapReduce, to automatically generate efficient partial aggregation functions. Moreover, we identify the complete conditions for sharing the results of practical user-defined aggregations without scanning the base data, and we propose a hybrid solution, combining a symbolic index with pull-up rules, to optimize the sharing process.
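To illustrate what decomposing an aggregation into partial aggregation functions can mean, here is a minimal sketch using the textbook case of the average. It is not the framework proposed in the report. The average is not distributive on its own, but it can be computed in parallel once each partition is summarized as a (sum, count) pair. The function names `partial_avg`, `merge`, and `finalize` and the sample partitions are hypothetical and chosen only for this example.

```python
from functools import reduce

# Hypothetical names for illustration only; not taken from the report.

def partial_avg(chunk):
    """Summarize one partition as a (sum, count) pair."""
    return (sum(chunk), len(chunk))

def merge(a, b):
    """Combine two partial states; the operation is associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def finalize(state):
    """Turn the merged state into the final average."""
    total, count = state
    return total / count

# Assumed sample data, split into three partitions.
partitions = [[1, 2, 3], [4, 5], [6]]

# Each partition is summarized independently, then the states are merged.
states = [partial_avg(p) for p in partitions]
merged = reduce(merge, states)
distributed_avg = finalize(merged)

# Reference value computed by scanning all base data at once.
flat = [x for p in partitions for x in p]
direct_avg = sum(flat) / len(flat)
```

Because `merge` is associative and commutative, the partial states can be combined in any order or grouping, which is what lets the partitions be processed independently.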
Document type :
Preprints, Working Papers, ...

Cited literature : 30 references

https://hal.archives-ouvertes.fr/hal-01877088
Contributor : Chao Zhang
Submitted on : Thursday, October 18, 2018 - 11:26:52 AM
Last modification on : Monday, January 20, 2020 - 12:12:06 PM

File

report_ZHANGChao.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01877088, version 2

Citation

Chao Zhang, Farouk Toumani. Decomposition and Sharing User-defined Aggregation: from Theory to Practice. 2018. ⟨hal-01877088v2⟩

Metrics

Record views : 95
File downloads : 212