Self-Management of Operational Issues for Grid Computing: The Case of the Virtual Imaging Platform

Raphaël Ferreira da Silva (1), Tristan Glatard (2), Frédéric Desprez (3)
(2) Images et Modèles, CREATIS - Centre de Recherche en Acquisition et Traitement de l'Image pour la Santé
(3) AVALON - Algorithms and Software Architectures for Distributed and HPC Platforms, Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract: Science gateways, such as the Virtual Imaging Platform (VIP), enable transparent access to distributed computing and storage resources for scientific computations. However, their large scale and the number of middleware systems they involve lead to many errors and faults. This chapter addresses the autonomic management of workflow executions on science gateways in an online and non-clairvoyant environment, where the platform workload, task costs, and resource characteristics are unknown and non-stationary. The chapter describes a general self-management process based on the MAPE-K loop (Monitoring, Analysis, Planning, Execution, and Knowledge) to cope with operational incidents of workflow executions. This process is then applied to handle late task executions, task granularities, and unfairness among workflow executions. Experimental results show how the approach achieves a fair quality of service by using control loops that constantly perform online monitoring, analysis, and execution of a set of curative actions.
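The MAPE-K loop named in the abstract can be illustrated with a small sketch. It is not the chapter's implementation: the `Task` and `Knowledge` types, the `late_factor` threshold, the median-based lateness test, and the "resubmit" action are all hypothetical choices made here only to show how the monitoring, analysis, planning, and execution phases might fit together around shared knowledge.

```python
from dataclasses import dataclass, field
from statistics import median


@dataclass
class Task:
    name: str
    elapsed: float          # seconds since the task started (hypothetical metric)
    done: bool = False


@dataclass
class Knowledge:
    # Shared state of the loop: an assumed lateness factor and a log of actions.
    late_factor: float = 2.0
    history: list = field(default_factory=list)


def monitor(tasks):
    """Monitoring: split observations into completed durations and running tasks."""
    completed = [t.elapsed for t in tasks if t.done]
    running = [t for t in tasks if not t.done]
    return completed, running


def analyze(completed, running, knowledge):
    """Analysis: flag running tasks much slower than the median completed task."""
    if not completed:
        return []  # nothing to compare against yet
    threshold = knowledge.late_factor * median(completed)
    return [t for t in running if t.elapsed > threshold]


def plan(late_tasks):
    """Planning: pick a curative action per incident (here, always resubmission)."""
    return [("resubmit", t.name) for t in late_tasks]


def execute(actions, knowledge):
    """Execution: this sketch only records the actions in the knowledge base."""
    knowledge.history.extend(actions)
    return actions


def mape_k_iteration(tasks, knowledge):
    """Run one pass of the control loop over the current task set."""
    completed, running = monitor(tasks)
    late = analyze(completed, running, knowledge)
    return execute(plan(late), knowledge)


tasks = [
    Task("t1", 100.0, done=True),
    Task("t2", 120.0, done=True),
    Task("t3", 110.0, done=True),
    Task("t4", 150.0),
    Task("t5", 400.0),
]
knowledge = Knowledge()
actions = mape_k_iteration(tasks, knowledge)
print(actions)  # [('resubmit', 't5')]
```

In this toy run the median completed duration is 110 s, so the assumed threshold is 220 s and only `t5` is flagged. A real gateway would call `mape_k_iteration` periodically and replace `execute` with actual middleware operations.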
Contributor: Béatrice Rayet
Submitted on: Thursday, February 11, 2016 - 11:10:49 AM
Last modification on: Thursday, June 6, 2019 - 3:54:19 PM



Raphaël Ferreira da Silva, Tristan Glatard, Frédéric Desprez. Self-Management of Operational Issues for Grid Computing: The Case of the Virtual Imaging Platform. Emerging Research in Cloud Distributed Computing Systems, Chapter 6, pp.187-221, 2015, ⟨10.4018/978-1-4666-8213-9.ch006⟩. ⟨hal-01272649⟩


