Characterizing communication and page usage of parallel applications for thread and data mapping - HAL open archive
Journal article, Performance Evaluation. Year: 2015

Characterizing communication and page usage of parallel applications for thread and data mapping

Abstract

The parallelism in shared-memory systems has increased significantly with the advent and evolution of multicore processors. Current systems include several multicore and multithreaded processors with Non-Uniform Memory Access (NUMA) characteristics. These architectures require the adoption of two strategies for the efficient execution of parallel applications: (i) threads sharing data should be placed in such a way in the memory hierarchy that they execute on shared caches; and (ii) a thread should have the data that it accesses placed on the NUMA node where it is executing. We refer to these techniques as thread and data mapping, respectively. Both strategies require knowledge of the application’s memory access behavior to identify the communication between threads and processes as well as their usage of memory pages. In this paper, we introduce a profiling method to establish the suitability of parallel applications for improved mappings that take the memory hierarchy into account, based on a mathematical description of their memory access behaviors. Experiments with a large set of parallel workloads that are based on a variety of parallel APIs (MPI, OpenMP, Pthreads, and MPI+OpenMP) show that most applications can benefit from improved mappings. We provide a mechanism to compute optimized thread and data mappings. Experimental results with this mechanism showed performance improvements of up to 54% (20% on average), as well as reductions of the energy consumption of up to 37% (11% on average), compared to the default mapping by the operating system. Furthermore, our results show that thread and data mapping have to be performed jointly in order to achieve optimal improvements.
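To make the two notions in the abstract concrete, the following is a minimal sketch of how communication between threads and page usage could be derived from a memory access trace. It is not the paper's profiling method or its mathematical description. The function name `characterize`, the 4 KiB `PAGE_SIZE`, the use of the minimum access count as a sharing estimate, and the assumption that consecutive thread IDs run on the same NUMA node are all hypothetical choices made for illustration.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # bytes; assumed page size, not taken from the paper


def characterize(trace, num_threads, threads_per_node):
    """Derive a communication matrix and a page-to-node mapping.

    trace: iterable of (thread_id, address) memory accesses.
    Assumption: threads k*threads_per_node .. (k+1)*threads_per_node-1
    execute on NUMA node k.
    """
    # page number -> {thread_id: number of accesses}
    page_accesses = defaultdict(lambda: defaultdict(int))
    for tid, addr in trace:
        page_accesses[addr // PAGE_SIZE][tid] += 1

    # comm[i][j] estimates how much threads i and j share data.
    comm = [[0] * num_threads for _ in range(num_threads)]
    # page number -> NUMA node chosen for that page
    page_to_node = {}

    for page, counts in page_accesses.items():
        tids = sorted(counts)
        # Two threads touching the same page are treated as communicating.
        for a in range(len(tids)):
            for b in range(a + 1, len(tids)):
                i, j = tids[a], tids[b]
                shared = min(counts[i], counts[j])  # illustrative heuristic
                comm[i][j] += shared
                comm[j][i] += shared

        # Data mapping: place the page on the node that accesses it most.
        node_counts = defaultdict(int)
        for tid, n in counts.items():
            node_counts[tid // threads_per_node] += n
        # Ties are broken in favor of the lowest node ID.
        page_to_node[page] = max(sorted(node_counts), key=node_counts.get)

    return comm, page_to_node


# Hypothetical trace: threads 0 and 1 share one page, threads 2 and 3 another.
example_trace = [
    (0, 0x1000), (1, 0x1008), (0, 0x1010),
    (2, 0x2000), (3, 0x2004), (3, 0x2008),
    (2, 0x5000),
]
comm, page_to_node = characterize(example_trace, num_threads=4, threads_per_node=2)
print(comm)
print(page_to_node)
```

In this sketch, a thread mapping would place thread pairs with large `comm` entries on cores that share a cache, while `page_to_node` gives a data mapping. Computing both from the same trace reflects the abstract's point that the two mappings should be considered jointly.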
No file deposited

Dates and versions

hal-01146859, version 1 (29-04-2015)

Cite

Matthias Diener, Eduardo Cruz, Laércio L. Pilla, Fabrice Dupros, Philippe Olivier Alexandre Navaux. Characterizing communication and page usage of parallel applications for thread and data mapping. Performance Evaluation, 2015, 88-89, pp.18-36. ⟨10.1016/j.peva.2015.03.001⟩. ⟨hal-01146859⟩

Collections

BRGM