Journal articles

Dual tree traversal on integrated GPUs for astrophysical N-body simulations

Abstract: In astrophysical N-body simulations, O(N) fast multipole methods (FMMs) with dual tree traversal (DTT) on multi-core CPUs are faster than O(N log N) CPU tree-codes but can still be outperformed by GPU tree-codes. In this paper, we aim to combine the best algorithm, namely FMM with DTT, with the most powerful hardware currently available, namely GPUs. In the astrophysical context, which requires only low accuracy and involves non-uniform particle distributions, we show that such a combination can be achieved with a hybrid CPU-GPU algorithm on integrated GPUs: while the DTT is performed on the CPU cores, the far- and near-field computations are all performed on the GPU cores. We show how to efficiently expose the interactions resulting from the DTT to the GPU cores, how to deploy both the far- and near-field computations on the GPU, and how to overlap the parallel DTT on the CPU with GPU computations. Based on the falcON code and using OpenCL on AMD Accelerated Processing Units and on Intel integrated GPUs, this first heterogeneous deployment of DTT for FMM outperforms standard multi-core CPUs and matches GPU and high-end CPU performance, and is hence more cost- and power-efficient.
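To make the division of work in the abstract concrete, the following is a minimal sketch of a dual tree traversal that only produces interaction lists, which is the part the abstract assigns to the CPU cores. It is not taken from falcON or from the paper: the 1D binary tree, the `Cell` class, the opening parameter `theta`, and the simple separation test are all illustrative assumptions, and pairs are treated as ordered (target, source) pairs for simplicity. The `far` list stands for the cell-cell interactions and the `near` list for the direct particle-particle interactions that a hybrid scheme would hand over to the GPU.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Cell:
    center: float                 # cell centre (1D for brevity; an assumption)
    radius: float                 # half-width of the cell
    bodies: List[int]             # indices of the particles inside the cell
    children: List["Cell"] = field(default_factory=list)


def build(xs: List[float], idx: List[int], lo: float, hi: float,
          leaf_size: int = 2) -> Cell:
    """Recursively bisect [lo, hi) until a cell holds at most leaf_size bodies."""
    cell = Cell(center=0.5 * (lo + hi), radius=0.5 * (hi - lo), bodies=idx)
    # The width guard stops the recursion if particles coincide.
    if len(idx) > leaf_size and hi - lo > 1e-12:
        mid = cell.center
        left = [i for i in idx if xs[i] < mid]
        right = [i for i in idx if xs[i] >= mid]
        for sub, a, b in ((left, lo, mid), (right, mid, hi)):
            if sub:  # empty children are not created
                cell.children.append(build(xs, sub, a, b, leaf_size))
    return cell


def well_separated(a: Cell, b: Cell, theta: float) -> bool:
    """Hypothetical acceptance test: cells are small compared to their distance."""
    return a.radius + b.radius < theta * abs(a.center - b.center)


def dual_tree_traversal(a: Cell, b: Cell, theta: float,
                        far: List[Tuple[Cell, Cell]],
                        near: List[Tuple[Cell, Cell]]) -> None:
    """Classify the (target a, source b) pair, or split one cell and recurse."""
    if well_separated(a, b, theta):
        far.append((a, b))        # far field: one cell-cell interaction
    elif not a.children and not b.children:
        near.append((a, b))       # near field: direct particle-particle sums
    elif not b.children or (a.children and a.radius >= b.radius):
        for child in a.children:  # split the target cell
            dual_tree_traversal(child, b, theta, far, near)
    else:
        for child in b.children:  # split the source cell
            dual_tree_traversal(a, child, theta, far, near)


# Usage: eight particles on the unit interval, traversal started on (root, root).
xs = [0.05, 0.1, 0.3, 0.45, 0.6, 0.62, 0.9, 0.95]
root = build(xs, list(range(len(xs))), 0.0, 1.0)
far, near = [], []
dual_tree_traversal(root, root, 0.5, far, near)
print(len(far), "far-field pairs,", len(near), "near-field pairs")
```

Every ordered particle pair ends up in exactly one entry of `far` or `near`, so the two lists can be processed independently of the traversal that produced them. That independence is what would allow list construction on one device and list evaluation on another, as the abstract describes at a high level.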

https://hal.sorbonne-universite.fr/hal-02073710
Contributor: Pierre Fortin
Submitted on: Thursday, July 23, 2020 - 4:13:37 PM
Last modification on: Monday, July 27, 2020 - 3:06:11 PM

File

article-HAL.pdf
Files produced by the author(s)

Citation

Pierre Fortin, Maxime Touche. Dual tree traversal on integrated GPUs for astrophysical N-body simulations. International Journal of High Performance Computing Applications, SAGE Publications, 2019, 33 (5), pp.960-972. ⟨10.1177/1094342019840806⟩. ⟨hal-02073710⟩
