Improving the Performance of Batch Schedulers Using Online Job Size Classification - HAL open archive
Preprint, Working Paper. Year: 2019

Abstract

Job scheduling on high-performance computing platforms is a hard problem that involves uncertainty in both the job arrival process and job execution times. Users typically provide loose upper-bound estimates of job execution times, which are hardly useful. Previous studies attempted to improve these estimates using regression techniques. Although these attempts provide reasonable predictions, they require a long period of training data. Furthermore, aiming for perfect prediction may be of limited use for scheduling purposes. In this work, we propose a simpler approach: classify jobs as small or large and prioritize the execution of small jobs over large ones. Indeed, small jobs are the most impacted by queuing delays, but they typically represent a light load and impose only a small burden on the other jobs. The classifier operates online and learns from data collected over the previous weeks, which facilitates its deployment and enables fast adaptation to changes in workload characteristics. We evaluate our approach using four scheduling policies on six HPC platform workload traces. We show that (i) incorporating such classification reduces the average bounded slowdown of jobs in all scenarios, and (ii) the obtained improvements are comparable, in most scenarios, to the ideal hypothetical situation where the scheduler would know the exact running time of jobs in advance.
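The idea in the abstract (classify each job as small or large, run small jobs first, and measure the effect with bounded slowdown) can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not specify the classifier, its features, or the small/large cut-off, so `is_small`, `SMALL_THRESHOLD_S`, the per-user history, and the first-come-first-served order within each class are all assumptions made for illustration. `bounded_slowdown` uses the common definition max((wait + run) / max(run, tau), 1), with tau = 10 s as an assumed bound.

```python
from dataclasses import dataclass

SMALL_THRESHOLD_S = 3600.0  # assumption: the abstract gives no small/large cut-off
TAU_S = 10.0                # assumption: a commonly used bound for bounded slowdown


@dataclass
class Job:
    job_id: int
    user: str
    submit_time: float  # seconds since the start of the trace


def is_small(job, history):
    """Hypothetical classifier: a job is 'small' when the mean runtime of
    its user's previously completed jobs is below SMALL_THRESHOLD_S."""
    past = history.get(job.user, [])
    if not past:
        return False  # no history yet: conservatively treat the job as large
    return sum(past) / len(past) < SMALL_THRESHOLD_S


def order_queue(queue, history):
    """Place predicted-small jobs first; keep submission order within a class."""
    return sorted(queue, key=lambda j: (not is_small(j, history), j.submit_time))


def bounded_slowdown(wait, run, tau=TAU_S):
    """Slowdown with very short runtimes clamped to tau, never below 1."""
    return max((wait + run) / max(run, tau), 1.0)


# Runtimes (seconds) of jobs each user completed earlier; made-up values.
history = {"alice": [120.0, 300.0], "bob": [7200.0, 86400.0]}
queue = [
    Job(1, "bob", 0.0),
    Job(2, "alice", 5.0),
    Job(3, "carol", 6.0),
    Job(4, "alice", 7.0),
]
ordered_ids = [j.job_id for j in order_queue(queue, history)]
print(ordered_ids)
print(bounded_slowdown(wait=100.0, run=5.0))
```

In this toy queue, the two jobs from the user with short past runtimes move ahead of the others. The bounded slowdown call shows why short jobs are the most sensitive to queuing delays: a 100 s wait is large relative to a 5 s runtime, even after clamping the runtime to `tau`.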
Main file
IEEE_IPDPS_2020_pre_print.pdf (898.52 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02334116, version 1 (25-10-2019)

Identifiers

  • HAL Id: hal-02334116, version 1

Cite

Salah Zrigui, Raphael Y. de Camargo, Denis Trystram, Arnaud Legrand. Improving the Performance of Batch Schedulers Using Online Job Size Classification. 2019. ⟨hal-02334116⟩