C. Augonnet, S. Thibault, R. Namyst, and P. A. Wacrenier, StarPU: A unified platform for task scheduling on heterogeneous multicore architectures, Concurr. Comput.: Pract. Exper., vol.23, issue.2, pp.187-198, 2011.
URL : https://hal.archives-ouvertes.fr/inria-00384363

O. Aumage, J. Bigot, K. Ejjaaouani, and M. Mehrenberger, InKS, a programming model to decouple performance from semantics in simulation codes, 2017.
URL : https://hal.archives-ouvertes.fr/cea-01493075

D. H. Bailey, E. Barszcz, J. T. Barton, D. S. Browning, R. L. Carter et al., The NAS parallel benchmarks, The International Journal of Supercomputing Applications, vol.5, issue.3, pp.63-73, 1991.

H. C. Edwards, C. R. Trott, and D. Sunderland, Kokkos, J. Parallel Distrib. Comput., vol.74, issue.12, pp.3202-3216, 2014.

R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. McDonald et al., Parallel Programming in OpenMP, 2001.

M. Christen, O. Schenk, and H. Burkhart, PATUS: A Code Generation and Autotuning Framework for Parallel Iterative Stencil Computations on Modern Microarchitectures, Parallel & Distributed Processing Symposium (IPDPS), 2011 IEEE International, pp.676-687, 2011.

M. Cosnard and E. Jeannot, Compact DAG representation and its dynamic scheduling, Journal of Parallel and Distributed Computing, vol.58, issue.3, pp.487-514, 1999.
URL : https://hal.archives-ouvertes.fr/inria-00098841

T. El-Ghazawi, W. Carlson, T. Sterling, and K. Yelick, UPC: Distributed Shared Memory Programming, Wiley-Interscience, 2005.

D. Griebler, J. Löff, L. Fernandes, G. Mencagli, and M. Danelutto, Efficient NAS benchmark kernels with C++ parallel programming, 2018.

M. Höhnerbach, A. E. Ismail, and P. Bientinesi, The vectorization of the Tersoff multi-body potential: An exercise in performance portability, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC '16), pp.7:1-7:13, 2016.

R. Hoque, T. Herault, G. Bosilca, and J. Dongarra, Dynamic task discovery in PaRSEC: A data-flow task-based runtime, Proceedings of the 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems (ScalA '17), pp.6:1-6:8, 2017.

S. Kamil, StencilProbe: A microbenchmark for stencil applications, pp.2017-2025, 2012.

K. Kormann, K. Reuter, M. Rampp, and E. Sonnendrücker, Massively parallel semi-Lagrangian solution of the 6D Vlasov-Poisson problem, 2016.

J. Lee and M. Sato, Implementation and performance evaluation of XcalableMP: A parallel programming language for distributed memory systems, 2010 39th International Conference on Parallel Processing Workshops, pp.413-420, 2010.

M. Mehrenberger, C. Steiner, L. Marradi, N. Crouseilles, E. Sonnendrücker et al., Vlasov on GPU (VOG project), ESAIM: Proc., vol.43, pp.37-58, 2013.
DOI : 10.1051/proc/201343003
URL : https://hal.archives-ouvertes.fr/hal-00908498

M. Steuwer, T. Remmelg, and C. Dubach, Lift: A functional data-parallel IR for high-performance GPU code generation, 2017 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), pp.74-85, 2017.

Y. Tang, R. A. Chowdhury, B. C. Kuszmaul, C. K. Luk, and C. E. Leiserson, The Pochoir stencil compiler, Proceedings of the Twenty-third Annual ACM Symposium on Parallelism in Algorithms and Architectures, pp.117-128, 2011.

H. Tanno and H. Iwasaki, Parallel skeletons for variable-length lists in SkeTo skeleton library, Proceedings of the 15th International Euro-Par Conference on Parallel Processing, pp.666-677, 2009.