Self-Management of Operational Issues for Grid Computing: The Case of the Virtual Imaging Platform

Raphaël Ferreira da Silva 1, Tristan Glatard 2, Frédéric Desprez 3
2 Images et Modèles
CREATIS - Centre de Recherche en Acquisition et Traitement de l'Image pour la Santé
3 AVALON - Algorithms and Software Architectures for Distributed and HPC Platforms
Inria Grenoble - Rhône-Alpes, LIP - Laboratoire de l'Informatique du Parallélisme
Abstract: Science gateways, such as the Virtual Imaging Platform (VIP), enable transparent access to distributed computing and storage resources for scientific computations. However, their large scale and the number of middleware systems involved in these gateways lead to many errors and faults. This chapter addresses the autonomic management of workflow executions on science gateways in an online and non-clairvoyant environment, where the platform workload, task costs, and resource characteristics are unknown and not stationary. The chapter describes a general self-management process based on the MAPE-K loop (Monitoring, Analysis, Planning, Execution, and Knowledge) to cope with operational incidents of workflow executions. Then, this process is applied to handle late task executions, task granularities, and unfairness among workflow executions. Experimental results show how the approach achieves a fair quality of service by using control loops that constantly perform online monitoring, analysis, and execution of a set of curative actions.
Document type:
Book chapter

https://hal.archives-ouvertes.fr/hal-01272649
Contributor: Béatrice Rayet
Submitted on: Thursday, 11 February 2016 - 11:10:49
Last modified on: Wednesday, 20 November 2019 - 03:17:25

Citation

Raphaël Ferreira da Silva, Tristan Glatard, Frédéric Desprez. Self-Management of Operational Issues for Grid Computing: The Case of the Virtual Imaging Platform. Emerging Research in Cloud Distributed Computing Systems, Chapter 6, pp.187-221, 2015, ⟨10.4018/978-1-4666-8213-9.ch006⟩. ⟨hal-01272649⟩