PAXQuery: Efficient Parallel Processing of Complex XQuery

Jesús Camacho-Rodríguez 1 Dario Colazzo 2 Ioana Manolescu 3, 4
3 OAK - Database optimizations and architectures for complex large data
CNRS - Centre National de la Recherche Scientifique : UMR8623, Inria Saclay - Ile de France, UP11 - Université Paris-Sud - Paris 11, LRI - Laboratoire de Recherche en Informatique
Abstract : Increasing volumes of data are being produced and exchanged over the Web, in particular in tree-structured formats such as XML or JSON. This leads to a need for highly scalable algorithms and tools for processing such data, capable of taking advantage of massively parallel processing platforms. This work considers the problem of efficiently parallelizing the execution of complex nested data processing, expressed in XQuery. We provide novel algorithms showing how to translate such queries into PACT, a recent framework generalizing MapReduce, in particular by supporting many-input tasks. We present the first formal translation of complex XQuery algebraic expressions into PACT plans, and demonstrate experimentally the efficiency and scalability of our approach.
Document type :
Journal articles

https://hal.archives-ouvertes.fr/hal-01162929
Contributor : Jesús Camacho-Rodríguez
Submitted on : Thursday, June 11, 2015 - 4:26:40 PM
Last modification on : Monday, May 28, 2018 - 2:38:02 PM
Long-term archiving on : Saturday, September 12, 2015 - 11:06:02 AM

File

TKDE2391110.pdf
Files produced by the author(s)

Citation

Jesús Camacho-Rodríguez, Dario Colazzo, Ioana Manolescu. PAXQuery: Efficient Parallel Processing of Complex XQuery. IEEE Transactions on Knowledge and Data Engineering, Institute of Electrical and Electronics Engineers, 2015, 27 (7), pp.1977 - 1991. ⟨http://www.computer.org/web/tkde⟩. ⟨10.1109/TKDE.2015.2391110⟩. ⟨hal-01162929⟩
