C. Augonnet, S. Thibault, and R. Namyst, StarPU: a Runtime System for Scheduling Tasks over Accelerator-Based Multicore Machines, 2010.
URL : https://hal.archives-ouvertes.fr/inria-00467677

B. Bhattacharya and S. Bhattacharyya, Parameterized dataflow modeling for DSP systems, IEEE Transactions on Signal Processing, vol.49, issue.10, pp.2408-2421, 2001.
DOI : 10.1109/78.950795

J. Bueno, L. Martinell, A. Duran, M. Farreras, X. Martorell et al., Productive Cluster Programming with OmpSs, Proceedings of the 17th International Conference on Parallel Processing -Volume Part I, pp.555-566, 2011.
DOI : 10.1147/rd.515.0593

R. Chandra, L. Dagum, D. Kohr, D. Maydan, J. Mcdonald et al., Parallel Programming in OpenMP, 2001.

W. Y. Chen, Optimizing Partitioned Global Address Space Programs for Cluster Architectures, 2007.

M. Flynn, Some Computer Organizations and Their Effectiveness, IEEE Transactions on Computers, vol.21, issue.9, pp.948-960, 1972.
DOI : 10.1109/TC.1972.5009071

L. Itti, C. Koch, and E. Niebur, A model of saliency-based visual attention for rapid scene analysis, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.20, issue.11, pp.1254-1259, 1998.
DOI : 10.1109/34.730558

D. B. Kirk and W. M. Hwu, Programming Massively Parallel Processors: A Handson Approach, 2010.

E. Lee and D. Messerschmitt, Static scheduling of synchronous data flow programs for digital signal processing. Computers, IEEE Transactions on C, vol.36, issue.1, pp.24-35, 1987.

P. S. Pacheco, Parallel Programming with MPI, 1996.

K. Parhi and D. Messerschmitt, Static rate-optimal scheduling of iterative data-flow programs via optimum unfolding. Computers, IEEE Transactions on, vol.40, issue.2, pp.178-195, 1991.

J. Reinders, Intel threading building blocks -outfitting C++ for multi-core processor parallelism, 2007.

J. Sanders and E. Kandrot, CUDA by Example: An Introduction to General-Purpose GPU Programming, 2010.