Efficient Abstractions for GPGPU Programming

Mathias Bourgoin 1 Emmanuel Chailloux 1 Jean-Luc Lamotte 2
1 APR - Algorithmes, Programmes et Résolution
LIP6 - Laboratoire d'Informatique de Paris 6
2 PEQUAN - Performance et Qualité des Algorithmes Numériques
LIP6 - Laboratoire d'Informatique de Paris 6
Abstract: General purpose (GP)GPU programming requires coupling highly parallel computing units with classic CPUs to obtain high performance. Heterogeneous systems lead to complex designs combining multiple paradigms and programming languages to manage each hardware architecture. In this paper, we present tools to harness GPGPU programming through the high-level OCaml programming language. We describe the SPOC library, which handles GPGPU subprograms (kernels) and data transfers between devices. We then present how SPOC expresses GPGPU kernels: through interoperability with common low-level extensions (from the Cuda and OpenCL frameworks) but also via an embedded DSL for OCaml. Using simple benchmarks as well as real-world HPC software, we show that SPOC can offer high performance while efficiently easing development. To allow better abstractions over tasks and data, we introduce some parallel skeletons built upon SPOC as well as composition constructs over those skeletons.
Document type:
Journal article
International Journal of Parallel Programming, Springer Verlag, 2014, 42 (4), pp.583-600. 〈10.1007/s10766-013-0261-x〉

Contributor: Lip6 Publications
Submitted on: Monday, 27 April 2015 - 17:53:41
Last modified on: Friday, 7 December 2018 - 01:28:28




Mathias Bourgoin, Emmanuel Chailloux, Jean-Luc Lamotte. Efficient Abstractions for GPGPU Programming. International Journal of Parallel Programming, Springer Verlag, 2014, 42 (4), pp.583-600. 〈10.1007/s10766-013-0261-x〉. 〈hal-01146170〉


