From scientific workflow patterns to 5-star linked open data

Abstract : Scientific workflow management systems have been widely adopted by data-intensive science communities. Many efforts have been dedicated to the representation and exploitation of provenance to improve reproducibility in data-intensive sciences. However, few works address the mining of provenance graphs to annotate the produced data with domain-specific context for better interpretation and sharing of results. In this paper, we propose PoeM, a lightweight framework for mining provenance in scientific workflows. PoeM produces linked in silico experiment reports based on workflow runs. PoeM leverages semantic web technologies and reference vocabularies (PROV-O, P-Plan) to generate provenance mining rules and finally assemble linked scientific experiment reports (Micropublications, Experimental Factor Ontology). Preliminary experiments demonstrate that PoeM enables the querying and sharing of Galaxy-processed genomic data as 5-star linked datasets.

https://hal.archives-ouvertes.fr/hal-01768449
Contributor : Alban Gaignard
Submitted on : Tuesday, April 17, 2018 - 11:41:46 AM
Last modification on : Friday, April 5, 2019 - 9:22:03 AM

File

tapp16-paper-gaignard.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01768449, version 1

Citation

Alban Gaignard, Hala Skaf-Molli, Audrey Bihouée. From scientific workflow patterns to 5-star linked open data. TaPP 2016: 8th USENIX Workshop on the Theory and Practice of Provenance, Jun 2016, Washington D.C., United States. ⟨hal-01768449⟩
