Scheduling Data Flow Program in XKaapi: A New Affinity Based Algorithm for Heterogeneous Architectures

Abstract: Efficient implementations of parallel applications on heterogeneous hybrid architectures require a careful balance between computations and communications with accelerator devices. Even if most of the communication time can be overlapped by computations, it is essential to reduce the total volume of communicated data. The literature therefore abounds with ad hoc methods to reach that balance, but these are architecture and application dependent. We propose here a generic mechanism to automatically optimize the scheduling between CPUs and GPUs, and compare two strategies within this mechanism: the classical Heterogeneous Earliest Finish Time (HEFT) algorithm and our new, parametrized, Distributed Affinity Dual Approximation algorithm (DADA), which groups the tasks by affinity before running a fast dual approximation. We ran experiments on a heterogeneous parallel machine with twelve CPU cores and eight NVIDIA Fermi GPUs. Three standard dense linear algebra kernels from the PLASMA library have been ported on top of the XKaapi runtime system, and we report their performance. The results show that HEFT and DADA both perform well under various experimental conditions, but that DADA performs better for larger systems and larger numbers of GPUs and, in most cases, generates much lower data transfers than HEFT to achieve the same performance.
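To make the earliest-finish-time idea behind HEFT concrete, the following is a minimal sketch, not the implementation from the paper or from XKaapi. It assumes independent tasks (no dependency graph) and ignores data transfers entirely, which is exactly the cost that DADA's affinity grouping targets. The function name `earliest_finish_schedule` and the cost table are hypothetical and chosen only for illustration.

```python
from typing import Dict, Tuple


def earliest_finish_schedule(
    costs: Dict[str, Dict[str, float]],
) -> Tuple[Dict[str, str], float]:
    """Greedy earliest-finish-time placement (simplified, hypothetical).

    costs[task][worker] is the assumed execution time of `task` on `worker`.
    Tasks are treated as independent and communication is ignored.
    Returns the task-to-worker placement and the resulting makespan.
    """
    # Time at which each worker becomes free.
    ready_at = {w: 0.0 for per_task in costs.values() for w in per_task}

    # Consider tasks by decreasing mean cost; ties are broken by name.
    order = sorted(
        costs, key=lambda t: (-sum(costs[t].values()) / len(costs[t]), t)
    )

    placement: Dict[str, str] = {}
    for task in order:
        # Pick the worker on which this task would finish earliest.
        best = min(costs[task], key=lambda w: (ready_at[w] + costs[task][w], w))
        ready_at[best] += costs[task][best]
        placement[task] = best

    return placement, max(ready_at.values(), default=0.0)


# Made-up execution times for three kernels on one CPU and one GPU.
example_costs = {
    "gemm": {"cpu": 8.0, "gpu": 1.0},
    "potrf": {"cpu": 2.0, "gpu": 3.0},
    "trsm": {"cpu": 4.0, "gpu": 1.0},
}
placement, makespan = earliest_finish_schedule(example_costs)
```

A full HEFT scheduler would additionally rank tasks along the dependency graph and add transfer times to each candidate finish time; both are omitted here for brevity.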

Cited literature: 17 references

https://hal.archives-ouvertes.fr/hal-01081629
Contributor: Grégory Mounié
Submitted on: Wednesday, November 12, 2014 - 6:18:00 PM
Last modification on: Thursday, April 4, 2019 - 10:18:05 AM
Document(s) archived on: Friday, February 13, 2015 - 10:20:42 AM

File

europar2014.pdf
Files produced by the author(s)

Citation

Raphaël Bleuse, Thierry Gautier, João V. F. Lima, Grégory Mounié, Denis Trystram. Scheduling Data Flow Program in XKaapi: A New Affinity Based Algorithm for Heterogeneous Architectures. Euro-Par 2014 Parallel Processing - 20th International Conference, Aug 2014, Porto, Portugal. pp. 560-571, ⟨10.1007/978-3-319-09873-9_47⟩. ⟨hal-01081629⟩
