Tuning EASY-Backfilling Queues

Jérôme Lelong (1), Valentin Reis (1, 2), Denis Trystram (2, 3)
(1) DAO - Données, Apprentissage et Optimisation, LJK - Laboratoire Jean Kuntzmann
(2) DATAMOVE - Data Aware Large Scale Computing, Inria Grenoble - Rhône-Alpes, LIG - Laboratoire d'Informatique de Grenoble
Abstract: EASY-Backfilling is a popular scheduling heuristic for allocating jobs on large-scale High Performance Computing platforms. While its aggressive reservation mechanism is fast and prevents job starvation, it does not try to optimize any scheduling objective per se. In this work, we consider the problem of tuning EASY using queue reordering policies. More precisely, we propose to tune the reordering using a simulation-based methodology. For a given system, we choose the policy that minimizes the average waiting time. This methodology departs from the First-Come, First-Serve rule and introduces a risk on the maximum values of the waiting time, which we control using a queue thresholding mechanism. This new approach is evaluated through a comprehensive experimental campaign on five production logs. In particular, we show that the behavior of the systems under study is stable enough to learn a heuristic that generalizes in a train/test fashion. Indeed, the average waiting time can be reduced consistently (between 11% and 42% for the logs used) compared to EASY, with almost no increase in maximum waiting times. This work departs from previous learning-based approaches and shows that scheduling heuristics for HPC can be learned directly in a policy space.
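The queue reordering with thresholding described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the `Job` fields, the `shortest_first` policy, the `reorder_queue` function, and the specific rule that jobs waiting longer than the threshold move to the head of the queue in First-Come, First-Serve order. It covers only the ordering step, not the backfilling or reservation logic of EASY.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; field names are illustrative only.
    job_id: int
    submit_time: float     # seconds since the start of the log
    requested_time: float  # user-supplied runtime estimate, in seconds
    processors: int


def fcfs(job: Job) -> float:
    """First-Come, First-Serve key: earliest submission first."""
    return job.submit_time


def shortest_first(job: Job) -> float:
    """Example reordering key: smallest requested runtime first."""
    return job.requested_time


def reorder_queue(
    queue: List[Job],
    now: float,
    policy: Callable[[Job], float],
    threshold: float,
) -> List[Job]:
    """Order the waiting queue by `policy`, with a waiting-time threshold.

    Assumed rule: jobs that have waited strictly longer than `threshold`
    go to the head of the queue in FCFS order; the rest follow, sorted
    by the chosen policy key.
    """
    overdue = [j for j in queue if now - j.submit_time > threshold]
    regular = [j for j in queue if now - j.submit_time <= threshold]
    return sorted(overdue, key=fcfs) + sorted(regular, key=policy)


queue = [
    Job(1, submit_time=0.0, requested_time=7200.0, processors=64),
    Job(2, submit_time=100.0, requested_time=600.0, processors=8),
    Job(3, submit_time=200.0, requested_time=60.0, processors=1),
]

# With a 900 s threshold at time 1000 s, only job 1 has waited too long.
with_threshold = [
    j.job_id
    for j in reorder_queue(queue, now=1000.0, policy=shortest_first, threshold=900.0)
]
# With an infinite threshold, the policy alone decides the order.
without_threshold = [
    j.job_id
    for j in reorder_queue(queue, now=1000.0, policy=shortest_first, threshold=float("inf"))
]
print(with_threshold, without_threshold)
```

In this sketch the threshold trades average waiting time against the maximum: a small value pushes the order back toward First-Come, First-Serve, while a large value lets the reordering policy dominate.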
Document type:
Conference paper
21st Workshop on Job Scheduling Strategies for Parallel Processing, May 2017, Orlando, United States. Springer, Lecture Notes in Computer Science, 10773, pp.43-61, 2018, 31st IEEE International Parallel & Distributed Processing Symposium 〈http://www.jsspp.org/〉. 〈10.1007/978-3-319-77398-8_3〉

https://hal.archives-ouvertes.fr/hal-01522459
Contributor: Valentin Reis
Submitted on: Monday, May 15, 2017 - 10:23:12
Last modified on: Thursday, October 11, 2018 - 08:48:05
Document(s) archived on: Thursday, August 17, 2017 - 00:20:52

File

paper.pdf
Files produced by the author(s)

Citation

Jérôme Lelong, Valentin Reis, Denis Trystram. Tuning EASY-Backfilling Queues. 21st Workshop on Job Scheduling Strategies for Parallel Processing, May 2017, Orlando, United States. Springer, Lecture Notes in Computer Science, 10773, pp.43-61, 2018, 31st IEEE International Parallel & Distributed Processing Symposium 〈http://www.jsspp.org/〉. 〈10.1007/978-3-319-77398-8_3〉. 〈hal-01522459〉
