Towards a control-theory approach for minimizing unused grid resources

Emmanuel Stahl 1 Agustín Yabo 2 Olivier Richard 3 Bruno Bzeznik 4 Bogdan Robu 5 Eric Rutten 6
2 BIOCORE - Biological control of artificial ecosystems
CRISAM - Inria Sophia Antipolis - Méditerranée, INRA - Institut National de la Recherche Agronomique, LOV - Laboratoire d'océanographie de Villefranche
5 GIPSA-SYSCO - SYSCO
GIPSA-DA - Département Automatique
6 CTRL-A - Control techniques for Autonomic, adaptive and Reconfigurable Computing systems
Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract: HPC systems face increasing variability in their behavior, for example in performance and power consumption, and because they are less predictable they require more runtime management. This can be done in an Autonomic Management feedback loop: monitored information from the system is analyzed, and the results are used to activate appropriate system-level or application-level feedback mechanisms (e.g., informing schedulers, down-clocking CPUs). One such problem arises in the context of CiGri, a simple, lightweight, scalable and fault-tolerant grid system that exploits the unused resources of a set of computing clusters. Computing power left over by the scheduling of the main HPC applications is used to execute smaller jobs, which are injected as fast as the global system allows. This paper presents first results addressing the problem of automated resource management in an HPC infrastructure, using techniques from Control Theory to design a controller that maximizes cluster utilization while avoiding overload. We put in place a feedback (Proportional-Integral, PI) control mechanism in the system software, which sets the maximum number of jobs to be sent to the cluster in response to system information about the current number of jobs being processed.
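The feedback loop described in the abstract can be illustrated with a minimal sketch of a discrete-time PI controller. This is not the authors' implementation: the class name `PIController`, the gains, the setpoint, the saturation bound, the anti-windup rule, and the toy cluster model in `simulate` (a fixed 20% of jobs finishing per control period) are all assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass


@dataclass
class PIController:
    """Discrete-time Proportional-Integral controller (illustrative sketch)."""
    kp: float            # proportional gain (hypothetical value)
    ki: float            # integral gain (hypothetical value)
    setpoint: float      # desired number of jobs being processed
    max_output: int      # assumed upper bound on jobs sent per control period
    integral: float = 0.0

    def update(self, measured_jobs: float) -> int:
        """Return the maximum number of jobs to send in the next period."""
        error = self.setpoint - measured_jobs
        candidate = self.integral + error
        raw = self.kp * error + self.ki * candidate
        # Anti-windup (an assumption of this sketch): accumulate the error
        # only while the unclamped output lies inside the allowed range.
        if 0 <= raw <= self.max_output:
            self.integral = candidate
        # Clamp to [0, max_output] and truncate to a whole number of jobs.
        return int(min(max(raw, 0), self.max_output))


def simulate(steps: int = 50) -> list:
    """Run the controller against a toy cluster model (assumed, not from the paper)."""
    controller = PIController(kp=0.25, ki=0.125, setpoint=100.0, max_output=50)
    jobs = 0.0
    history = []
    for _ in range(steps):
        submitted = controller.update(jobs)
        # Toy plant: 20% of the jobs present finish during each period.
        jobs = 0.8 * jobs + submitted
        history.append(jobs)
    return history


if __name__ == "__main__":
    for step, jobs in enumerate(simulate(20), start=1):
        print(f"period {step:2d}: {jobs:7.2f} jobs in the cluster")
```

Running the script prints how the number of jobs in the toy cluster evolves as the controller adjusts the submission limit. In a real deployment the measurement would come from the cluster's resource manager and the gains would have to be tuned against an identified model of the system.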
Document type:
Conference paper
AI-Science'18 - workshop on Autonomous Infrastructure for Science, in conjunction with the ACM HPDC 2018, Jun 2018, Tempe, AZ, United States. pp.1-8, 〈http://www.hpdc.org/2018/〉. 〈10.1145/3217197.3217201〉

https://hal.archives-ouvertes.fr/hal-01823787
Contributor: Eric Rutten
Submitted on: Tuesday, 26 June 2018 - 14:35:30
Last modified on: Thursday, 18 October 2018 - 01:16:30

Citation

Emmanuel Stahl, Agustín Yabo, Olivier Richard, Bruno Bzeznik, Bogdan Robu, et al.. Towards a control-theory approach for minimizing unused grid resources. AI-Science'18 - workshop on Autonomous Infrastructure for Science, in conjunction with the ACM HPDC 2018, Jun 2018, Tempe, AZ, United States. pp.1-8, 〈http://www.hpdc.org/2018/〉. 〈10.1145/3217197.3217201〉. 〈hal-01823787〉
