Conference paper, Year: 2003

Pipelined parallelism for multi-join queries on shared nothing machines

Abstract

The development of scalable parallel database systems requires the design of efficient algorithms, especially for the join, which is the most frequent and expensive operation in relational database systems. The join is also the operation most vulnerable to data skew and to the high cost of communication in distributed architectures. Moreover, for multi-join queries, the data skew problem is more complicated because the imbalance of intermediate results is unknown during static query optimization. In this paper, we show that the join algorithms presented in our earlier papers can be applied efficiently in various parallel execution strategies, making it possible to exploit not only intra-operator parallelism but also inter-operator parallelism. These algorithms reduce communication and synchronization costs to a minimum while guaranteeing perfect load balancing during all stages of join computation, even for highly skewed data.
No file deposited

Dates and versions

hal-00081353, version 1 (22-06-2006)

Identifiers

  • HAL Id: hal-00081353, version 1

Cite

Mostafa Bamha, Matthieu Exbrayat. Pipelined parallelism for multi-join queries on shared nothing machines. (ParCo 2003), 2003, France. ⟨hal-00081353⟩