
PAXQuery: Parallel Analytical XML Processing

Jesús Camacho-Rodríguez (1), Dario Colazzo (2), Ioana Manolescu (3, 4), Juan A. M. Naranjo (3, 4)
3 OAK - Database optimizations and architectures for complex large data
LRI - Laboratoire de Recherche en Informatique, UP11 - Université Paris-Sud - Paris 11, Inria Saclay - Ile de France, CNRS - Centre National de la Recherche Scientifique : UMR8623
Abstract : XQuery is a general-purpose programming language for processing semi-structured data, and as such, it is very expressive. As a consequence, optimizing and parallelizing complex analytics XQuery queries is still an open, challenging problem. We demonstrate PAXQuery, a novel system that parallelizes the execution of XQuery queries over large collections of XML documents. PAXQuery compiles a rich subset of XQuery into plans expressed in the PArallelization ConTracts (PACT) programming model. Thanks to this translation, the resulting plans are optimized and executed in a massively parallel fashion by the Apache Flink system. The result is a scalable system capable of querying massive amounts of XML data very efficiently, as proved by the experimental results we outline.
Document type : Conference papers
Contributor : Jesús Camacho-Rodríguez
Submitted on : Monday, July 20, 2015 - 11:27:55 AM
Last modification on : Wednesday, September 23, 2020 - 4:29:12 AM
Long-term archiving on : Wednesday, October 21, 2015 - 5:12:35 PM


Publisher files allowed on an open archive



Jesús Camacho-Rodríguez, Dario Colazzo, Ioana Manolescu, Juan A. M. Naranjo. PAXQuery: Parallel Analytical XML Processing. ACM SIGMOD International Conference on Management of Data 2015, May 2015, Melbourne, Victoria, Australia. pp.1117-1122, ⟨10.1145/2723372.2735374⟩. ⟨hal-01178490⟩


