PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments

Abstract : Replicating data fragments in Linked Data improves data availability and the performance of federated query engines. Existing replication-aware federated query engines mainly focus on source selection and query decomposition in order to prune redundant sources and reduce intermediate results thanks to data locality. In this paper, we extend replication-aware federated query engines with a replication-aware parallel join operator: PeNeLoop. PeNeLoop exploits redundant sources to parallelize the join operator and reduce execution time. We implemented PeNeLoop in the federated query engine FedX with the replication-aware source selection Fedra, and we empirically evaluated the performance of FedX+Fedra+PeNeLoop. Experimental results suggest that FedX+Fedra+PeNeLoop outperforms FedX+Fedra in terms of execution time while preserving answer completeness.
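
The abstract does not spell out the operator's internals, so the following is only a rough illustration of the general idea of using redundant sources to parallelize a join, not the PeNeLoop algorithm itself. It assumes a simple round-robin policy, in-memory dictionaries standing in for replicated endpoints, and hypothetical names (`parallel_bound_join`, `probe_in_memory`) that do not come from the paper, FedX, or Fedra.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle


def parallel_bound_join(left_bindings, replicas, probe, max_workers=None):
    """Join left bindings against several sources holding the same fragment.

    left_bindings: list of dicts (solution mappings from the left operand)
    replicas: list of source handles assumed to hold identical data
    probe: callable (replica, binding) -> list of dicts extending the binding
    """
    if not replicas:
        raise ValueError("at least one replica is required")
    # Round-robin assignment (an assumption of this sketch):
    # binding i is probed on replica i mod len(replicas).
    assignments = list(zip(left_bindings, cycle(replicas)))
    with ThreadPoolExecutor(max_workers=max_workers or len(replicas)) as pool:
        # pool.map yields results in input order, so output order is stable.
        probed = list(pool.map(lambda pair: probe(pair[1], pair[0]), assignments))
    return [
        {**binding, **extension}
        for (binding, _), extensions in zip(assignments, probed)
        for extension in extensions
    ]


def probe_in_memory(replica, binding):
    # Hypothetical stand-in for a remote lookup: films by a given director.
    return [{"film": film} for film in replica.get(binding["director"], [])]


fragment = {"d1": ["f1", "f2"], "d2": ["f3"]}
replicas = [dict(fragment), dict(fragment)]  # two copies of the same fragment
left = [{"director": "d1"}, {"director": "d2"}, {"director": "d3"}]

joined = parallel_bound_join(left, replicas, probe_in_memory)
single = parallel_bound_join(left, replicas[:1], probe_in_memory)
```

Because every replica is assumed to hold the same fragment, spreading the probes over two replicas returns the same answers as probing a single one; only the load distribution changes.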

https://hal.archives-ouvertes.fr/hal-01549751
Contributor : Thomas Minier
Submitted on : Thursday, July 6, 2017 - 3:34:57 PM
Last modification on : Tuesday, March 26, 2019 - 5:05:54 PM
Long-term archiving on : Thursday, December 14, 2017 - 7:05:23 PM

File

paper_peneloop.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01549751, version 1

Citation

Thomas Minier, Gabriela Montoya, Hala Molli, Pascal Molli. PeNeLoop: Parallelizing Federated SPARQL Queries in Presence of Replicated Fragments. Querying the Web of Data (QuWeDa 2017) Workshop, co-located with 14th ESWC 2017 (Awarded Best workshop paper), Muhammad Saleem; Ricardo Usbeck; Ruben Verborgh; Axel-Cyrille Ngonga Ngomo, May 2017, Portorož, Slovenia. pp.37-50. ⟨hal-01549751⟩