Journal articles

Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities

Sarah Cohen-Boulakia 1, 2, 3, * Khalid Belhajjame 4 Olivier Collin 5 Jérôme Chopard 6 Christine Froidevaux 2 Alban Gaignard 7 Konrad Hinsen 8, 9 Pierre Larmande 3, 10, 1 Yvan Le Bras 5 Frédéric Lemoine 11 Fabien Mareuil 12, 13 Hervé Ménager 12, 13 Christophe Pradal 14, 15 Christophe Blanchet 16
* Corresponding author
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
5 Plateforme bioinformatique GenOuest [Rennes]
UR1 - Université de Rennes 1, Plateforme Génomique Santé Biogenouest®, Inria Rennes – Bretagne Atlantique , IRISA-D7 - GESTION DES DONNÉES ET DE LA CONNAISSANCE
15 VIRTUAL PLANTS - Modeling plant morphogenesis at different scales, from genes to phenotype
CRISAM - Inria Sophia Antipolis - Méditerranée , INRA - Institut National de la Recherche Agronomique, UMR AGAP - Amélioration génétique et adaptation des plantes méditerranéennes et tropicales
Abstract: With the development of new experimental technologies, biologists are faced with an avalanche of data to be computationally analyzed for scientific advancements and discoveries to emerge. Faced with the complexity of analysis pipelines, the large number of computational tools, and the enormous amount of data to manage, there is compelling evidence that many if not most scientific discoveries will not stand the test of time: increasing the reproducibility of computed results is of paramount importance. The objective we set out in this paper is to place scientific workflows in the context of reproducibility. To do so, we define several kinds of reproducibility that can be reached when scientific workflows are used to perform experiments. We characterize and define the criteria that need to be catered for by reproducibility-friendly scientific workflow systems, and use such criteria to place several representative and widely used workflow systems and companion tools within such a framework. We also discuss the remaining challenges posed by reproducible scientific workflows in the life sciences. Our study was guided by three use cases from the life science domain involving in silico experiments.

Contributor: Sarah Cohen-Boulakia
Submitted on: Friday, April 28, 2017 - 4:22:06 PM
Last modification on: Tuesday, May 17, 2022 - 2:30:02 PM
Long-term archiving on: Saturday, July 29, 2017 - 1:44:42 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution 4.0 International License



Sarah Cohen-Boulakia, Khalid Belhajjame, Olivier Collin, Jérôme Chopard, Christine Froidevaux, et al. Scientific workflows for computational reproducibility in the life sciences: Status, challenges and opportunities. Future Generation Computer Systems, Elsevier, 2017, 75, pp.284-298. ⟨10.1016/j.future.2017.01.012⟩. ⟨hal-01516082⟩


