Parallelizing Federated SPARQL Queries in Presence of Replicated Data

Federated query engines have been enhanced, e.g., with Fedra, to exploit the new data localities created by replicated data. However, existing replication-aware federated query engines mainly focus on pruning sources during source selection and query decomposition, using data locality to reduce intermediate results. In this paper, we implement a replication-aware parallel join operator, Pen, which can be used to exploit replicated data during query execution. For existing replication-aware federated query engines, this operator exploits replicated data to parallelize the execution of joins and reduce execution time. For Triple Pattern Fragment (TPF) clients, this operator exploits the availability of several TPF servers exposing the same dataset to share the load among the servers. We implemented Pen in the federated query engine FedX with the replication-aware source selection Fedra, and in the reference TPF client. We empirically evaluated the performance of engines extended with the Pen operator; the experimental results suggest that our extensions outperform the existing approaches in terms of execution time and load balancing among the servers, respectively.
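To make the idea of a replication-aware parallel join concrete, the following is a minimal sketch, not the Pen implementation from the paper. It assumes a toy in-memory `Replica` class standing in for servers that expose the same dataset, and a hypothetical `parallel_bound_join` function that assigns each left-hand binding to a replica round-robin and probes the replicas concurrently. All names and data are illustrative assumptions.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import cycle

# Toy dataset, replicated on every server (hypothetical data).
TRIPLES = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "livesIn", "nantes"),
    ("bob", "livesIn", "paris"),
]


class Replica:
    """In-memory stand-in for one server exposing a copy of the dataset."""

    def __init__(self, name, triples):
        self.name = name
        self.triples = triples

    def match(self, s, p, o):
        """Return the triples matching a pattern; None acts as a wildcard."""
        return [
            t for t in self.triples
            if all(q is None or q == v for q, v in zip((s, p, o), t))
        ]


def parallel_bound_join(left_bindings, predicate, replicas):
    """Join each left binding x with the pattern (x, predicate, ?y).

    Bindings are assigned to replicas round-robin and the lookups run
    concurrently, so no single server evaluates the whole join.
    """
    assignments = list(zip(left_bindings, cycle(replicas)))

    def probe(assignment):
        x, replica = assignment
        return [(x, o) for (_, _, o) in replica.match(x, predicate, None)]

    with ThreadPoolExecutor(max_workers=len(replicas)) as pool:
        chunks = list(pool.map(probe, assignments))  # map keeps input order

    results = [pair for chunk in chunks for pair in chunk]
    load = Counter(replica.name for _, replica in assignments)
    return results, load


replicas = [Replica("server-a", TRIPLES), Replica("server-b", TRIPLES)]
# Left side of the join: every subject that knows someone.
people = [s for (s, _, _) in replicas[0].match(None, "knows", None)]
results, load = parallel_bound_join(people, "livesIn", replicas)
```

In this sketch `load` records how many probes each replica received, which is one simple way to observe how the work is shared; a real engine would issue SPARQL or TPF requests instead of in-memory lookups.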

Keywords

Parallel Query Processing; Linked Data; Fragment Replication; Load Balancing; Federated SPARQL Queries Processing; Triple Pattern Fragment

Domains

Databases [cs.DB]; Distributed, Parallel, and Cluster Computing [cs.DC]; Data Structures and Algorithms [cs.DS]; Web

Main file

paper_peneloop_preprint.pdf (439.21 KB)

Origin: Files produced by the author(s)

Contributor: Thomas Minier

https://hal.science/hal-01591791

Submitted on: Wednesday, January 3, 2018, 16:36:07

Last modified on: Friday, March 24, 2023, 14:53:06

Long-term archiving on: Thursday, May 3, 2018, 06:17:21

Dates and versions

hal-01591791 , version 1 (22-09-2017)

hal-01591791 , version 2 (03-01-2018)

Identifiers

HAL Id : hal-01591791 , version 2
DOI : 10.1007/978-3-319-70407-4_33

Cite

Thomas Minier, Gabriela Montoya, Hala Skaf-Molli, Pascal Molli. Parallelizing Federated SPARQL Queries in Presence of Replicated Data. 14th ESWC 2017, May 2017, Portoroz, Slovenia. pp.181-196, ⟨10.1007/978-3-319-70407-4_33⟩. ⟨hal-01591791v2⟩


