Journal articles

Dual tree traversal on integrated GPUs for astrophysical N-body simulations

Abstract : In astrophysical N-body simulations, O(N) fast multipole methods (FMMs) with dual tree traversal (DTT) on multi-core CPUs are faster than O(N log N) CPU tree-codes but can still be outperformed by GPU tree-codes. In this paper, we aim to combine the best algorithm, namely FMM with DTT, with the most powerful hardware currently available, namely GPUs. In the astrophysical context, which requires low accuracies and involves non-uniform particle distributions, we show that such a combination can be achieved thanks to a hybrid CPU-GPU algorithm on integrated GPUs: while the DTT is performed on the CPU cores, the far- and near-field computations are all performed on the GPU cores. We show how to efficiently expose the interactions resulting from the DTT to the GPU cores, how to deploy both the far- and near-field computations on the GPU, and how to overlap the parallel DTT on the CPU with GPU computations. Based on the falcON code and using OpenCL on AMD Accelerated Processing Units and on Intel integrated GPUs, this first heterogeneous deployment of DTT for FMM outperforms standard multi-core CPUs and matches GPU and high-end CPU performance, while being more cost- and power-efficient.
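The abstract gives no implementation details, so the following is only a minimal sketch of the general dual tree traversal idea, not the falcON code or the authors' OpenCL implementation. It assumes a 1D binary tree, a simple opening-angle acceptance test with a hypothetical parameter `theta`, and hypothetical names (`Cell`, `build_tree`, `dual_tree_traversal`). The traversal only collects two interaction lists, far-field (cell-cell) and near-field (leaf-leaf), which is the kind of output a CPU-side traversal could hand over to a GPU for the actual computations.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cell:
    """A 1D tree cell covering [lo, hi] (hypothetical structure for illustration)."""
    lo: float
    hi: float
    points: List[float]
    children: List["Cell"] = field(default_factory=list)

    @property
    def center(self) -> float:
        return 0.5 * (self.lo + self.hi)

    @property
    def radius(self) -> float:
        return 0.5 * (self.hi - self.lo)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def build_tree(points, lo, hi, leaf_size=2, max_depth=20) -> Cell:
    """Recursively bisect [lo, hi]; empty halves are not created."""
    cell = Cell(lo, hi, list(points))
    if len(points) > leaf_size and max_depth > 0:
        mid = 0.5 * (lo + hi)
        left = [p for p in points if p < mid]
        right = [p for p in points if p >= mid]
        if left:
            cell.children.append(build_tree(left, lo, mid, leaf_size, max_depth - 1))
        if right:
            cell.children.append(build_tree(right, mid, hi, leaf_size, max_depth - 1))
    return cell


def well_separated(a: Cell, b: Cell, theta: float) -> bool:
    """Assumed acceptance test: center distance large relative to cell sizes."""
    return abs(a.center - b.center) * theta > a.radius + b.radius


def dual_tree_traversal(a: Cell, b: Cell, theta: float,
                        far: List[Tuple[Cell, Cell]],
                        near: List[Tuple[Cell, Cell]]) -> None:
    """Walk target tree (a) and source tree (b) simultaneously."""
    if well_separated(a, b, theta):
        far.append((a, b))       # cell-cell (far-field) interaction
    elif a.is_leaf and b.is_leaf:
        near.append((a, b))      # direct particle-particle (near-field) interaction
    elif b.is_leaf or (not a.is_leaf and a.radius >= b.radius):
        for child in a.children:  # split the larger (or only splittable) cell
            dual_tree_traversal(child, b, theta, far, near)
    else:
        for child in b.children:
            dual_tree_traversal(a, child, theta, far, near)


if __name__ == "__main__":
    pts = [0.05, 0.1, 0.2, 0.45, 0.5, 0.7, 0.9, 0.95]
    root = build_tree(pts, 0.0, 1.0)
    far_list, near_list = [], []
    dual_tree_traversal(root, root, 0.5, far_list, near_list)
    print(len(far_list), "far-field pairs,", len(near_list), "near-field pairs")
```

In this sketch every ordered (target, source) particle pair ends up in exactly one list entry, so the two lists can be processed independently and in any order.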


Contributor : Pierre Fortin
Submitted on : Thursday, July 23, 2020 - 4:13:37 PM
Last modification on : Thursday, March 24, 2022 - 3:43:08 AM
Long-term archiving on : Tuesday, December 1, 2020 - 6:28:04 AM


Files produced by the author(s)



Pierre Fortin, Maxime Touche. Dual tree traversal on integrated GPUs for astrophysical N-body simulations. International Journal of High Performance Computing Applications, SAGE Publications, 2019, 33 (5), pp.960-972. ⟨10.1177/1094342019840806⟩. ⟨hal-02073710⟩


