SHARP: Harmonizing Galaxy and Taverna workflow provenance

Abstract : SHARP is a Linked Data approach for harmonizing cross-workflow provenance. In this demo, we demonstrate SHARP through a real-world omic experiment involving workflow traces generated by the Taverna and Galaxy systems. SHARP starts by interlinking the provenance traces generated by Galaxy and Taverna workflows and then harmonizes the interlinked graphs using OWL and PROV inference rules. The resulting provenance graph can be exploited to answer queries across Galaxy and Taverna workflow runs.

PROV has been adopted by a number of workflow systems for encoding the traces of workflow executions. Exploiting these provenance traces is hampered by two main impediments. Firstly, workflow systems extend PROV differently to cater for system-specific constructs. The differences between the adopted PROV extensions yield heterogeneity in the generated provenance traces. This heterogeneity diminishes the value of such traces, e.g. when combining and querying provenance traces from different workflow systems. Secondly, the provenance recorded by workflow systems tends to be large, and as such is difficult for a human user to browse and understand. In this paper, we propose SHARP, a Linked Data approach for harmonizing cross-workflow provenance. The harmonization is performed by chasing tuple-generating and equality-generating dependencies defined for workflow provenance. This results in a provenance graph that can be summarized using domain-specific vocabularies. We experimentally evaluate the effectiveness of SHARP using a real-world omic experiment involving workflow traces generated by the Taverna and Galaxy systems.
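The two harmonization steps named in the abstract can be illustrated with a minimal sketch. It is not the SHARP implementation: triples are plain Python tuples rather than an RDF store, the identifiers (galaxy:vcf_out, taverna:vcf_in, and so on) are hypothetical, and only one PROV-style rule is applied. Merging identifiers declared equal plays the role of an equality-generating dependency; deriving a new prov:wasInformedBy edge from a generation and a usage of the same entity plays the role of a tuple-generating dependency.

```python
# Minimal sketch of cross-workflow provenance harmonization.
# All identifiers below are hypothetical examples, not taken from the paper.

def canonicalize(triples, same_as):
    """Merge identifiers declared equal (equality-generating step)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in same_as:
        ra, rb = find(a), find(b)
        if ra != rb:
            # Keep the lexicographically smaller name as representative.
            parent[max(ra, rb)] = min(ra, rb)

    # Rewrite subjects and objects; predicates are left untouched.
    return {(find(s), p, find(o)) for s, p, o in triples}


def infer_communication(triples):
    """Add wasInformedBy edges (tuple-generating step).

    If entity e wasGeneratedBy activity a1 and activity a2 used e,
    then a2 wasInformedBy a1.
    """
    generated = [(s, o) for s, p, o in triples if p == "prov:wasGeneratedBy"]
    used = [(s, o) for s, p, o in triples if p == "prov:used"]
    inferred = {
        (a2, "prov:wasInformedBy", a1)
        for e, a1 in generated
        for a2, e2 in used
        if e == e2
    }
    return set(triples) | inferred


# One triple from a Galaxy trace, one from a Taverna trace (hypothetical).
traces = {
    ("galaxy:vcf_out", "prov:wasGeneratedBy", "galaxy:variant_calling"),
    ("taverna:annotation", "prov:used", "taverna:vcf_in"),
}
# Interlinking: both identifiers denote the same file.
links = [("galaxy:vcf_out", "taverna:vcf_in")]

harmonized = infer_communication(canonicalize(traces, links))
```

After interlinking, the Taverna activity is connected to the Galaxy activity that produced its input, so a single query over the harmonized set can follow lineage across both systems.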
Document type :
Conference papers

https://hal.archives-ouvertes.fr/hal-01768394
Contributor : Alban Gaignard
Submitted on : Tuesday, April 17, 2018 - 11:16:31 AM
Last modification on : Wednesday, June 26, 2019 - 11:02:39 AM

File

sharp-paper-preprint.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01768394, version 1

Citation

Alban Gaignard, Khalid Belhajjame, Hala Skaf-Molli. SHARP: Harmonizing Galaxy and Taverna workflow provenance. SeWeBMeDA 2017 : Semantic Web solutions for large-scale BioMedical Data Analytics - 14th ESWC 2017, May 2017, Portoroz, Slovenia. ⟨hal-01768394⟩
