HAL will be down for maintenance from Friday, June 10 at 4pm through Monday, June 13 at 9am. More information
Skip to Main content Skip to Navigation
Journal articles

Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages

Andi Drebes 1 Karine Heydemann 1 Nathalie Drach 1 Antoniu Pop 2 Albert Cohen 3, 4
1 ALSOC - Architecture et Logiciels pour Systèmes Embarqués sur Puce
LIP6 - Laboratoire d'Informatique de Paris 6
4 Parkas - Parallélisme de Kahn Synchrone
CNRS - Centre National de la Recherche Scientifique : UMR 8548, Inria Paris-Rocquencourt, DI-ENS - Département d'informatique - ENS Paris
Abstract : We present a joint scheduling and memory allocation algorithm for efficient execution of task-parallel programs on non-uniform memory architecture (NUMA) systems. Task and data placement decisions are based on a static description of the memory hierarchy and on runtime information about intertask communication. Existing locality-aware scheduling strategies for fine-grained tasks have strong limitations: they are specific to some class of machines or applications, they do not handle task dependences, they require manual program annotations, or they rely on fragile profiling schemes. By contrast, our solution makes no assumption on the structure of programs or on the layout of data in memory. Experimental results, based on the OpenStream language, show that locality of accesses to main memory of scientific applications can be increased significantly on a 64-core machine, resulting in a speedup of up to 1.63× compared to a state-of-the-art work-stealing scheduler.
Complete list of metadata

Contributor : Andi Drebes Connect in order to contact the contributor
Submitted on : Friday, March 27, 2015 - 2:01:00 PM
Last modification on : Thursday, March 17, 2022 - 10:08:44 AM

Links full text



Andi Drebes, Karine Heydemann, Nathalie Drach, Antoniu Pop, Albert Cohen. Topology-Aware and Dependence-Aware Scheduling and Memory Allocation for Task-Parallel Languages. ACM Transactions on Architecture and Code Optimization, Association for Computing Machinery, 2014, 11 (3), pp.30. ⟨10.1145/2641764⟩. ⟨hal-01136491⟩



Record views