Pipelined Parallelism in Multi-Join Queries on Heterogeneous Shared Nothing Architectures
Conference paper, Year: 2008

Abstract

Pipelined parallelism has been widely studied and successfully implemented in several join algorithms on shared-nothing machines, under ideal conditions of load balance between processors and in the absence of data skew. The aim of pipelining is to allow flexible resource allocation while avoiding unnecessary disk input/output for intermediate join results when processing multi-join queries. The main drawback of pipelining in existing algorithms is that communication and load balancing remain limited to static approaches (generated during the query optimization phase) that use hashing to redistribute data over the network. They therefore cannot solve the problems of data skew and load imbalance between processors on heterogeneous multi-processor architectures, where the load of each processor may vary dynamically and unpredictably. In this paper, we present a new parallel join algorithm that solves the data skew problem while guaranteeing perfect balancing properties on heterogeneous multi-processor shared-nothing architectures. The performance of this algorithm is analyzed using the scalable and portable BSP (Bulk Synchronous Parallel) cost model.
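To illustrate why static hash-based redistribution cannot cope with data skew, here is a minimal sketch. It is not the algorithm presented in the paper. The function names `hash_partition` and `imbalance`, the modulo hash, the four processors, and the synthetic key lists are all assumptions chosen for illustration.

```python
def hash_partition(keys, num_procs):
    """Static hash redistribution (illustrative): a tuple with join key k
    is sent to processor k % num_procs. Returns the tuple count per processor."""
    loads = [0] * num_procs
    for k in keys:
        loads[k % num_procs] += 1
    return loads


def imbalance(loads):
    """Heaviest processor load divided by the mean load (1.0 = perfect balance)."""
    return max(loads) / (sum(loads) / len(loads))


# Hypothetical uniform join attribute: 1000 distinct keys, one tuple each.
uniform = list(range(1000))
# Hypothetical skewed join attribute: key 7 carries 700 of the 1000 tuples.
skewed = [7] * 700 + list(range(300))

uniform_loads = hash_partition(uniform, 4)
skewed_loads = hash_partition(skewed, 4)

print(uniform_loads, imbalance(uniform_loads))
print(skewed_loads, imbalance(skewed_loads))
```

Because every tuple with the same key hashes to the same processor, the frequent key in the skewed input lands entirely on one processor, and no choice of hash function made at optimization time can spread it out.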

Dates and versions

hal-00460656, version 1 (01-03-2010)

Identifiers

  • HAL Id: hal-00460656, version 1

Cite

Mohamad Al Hajj Hassan, Mostafa Bamha. Pipelined Parallelism in Multi-Join Queries on Heterogeneous Shared Nothing Architectures. (ICSOFT'2008), Jul 2008, Porto, Portugal. pp.127-134. ⟨hal-00460656⟩