C. Alias, F. Baray, and A. Darte, Bee+Cl@k: An implementation of lattice-based array contraction in the source-to-source translator Rose, ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'07), pp.73-82, 2007.

U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan, A practical automatic polyhedral parallelizer and locality optimizer, ACM International Conference on Programming Languages Design and Implementation (PLDI'08), pp.101-113, 2008.

P. Boulet and P. Feautrier, Scanning polyhedra without Do-loops, Proceedings. 1998 International Conference on Parallel Architectures and Compilation Techniques (Cat. No.98EX192), 1998.
DOI : 10.1109/PACT.1998.727127

URL : https://hal.archives-ouvertes.fr/inria-00564990

D. Chavarria-miranda and J. Mellor-crummey, Effective communication coalescing for data-parallel applications, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming , PPoPP '05, pp.14-25, 2005.
DOI : 10.1145/1065944.1065948

P. Coussy and A. Morawiec, High-Level Synthesis: From Algorithm to Digital Circuit, 2008.
DOI : 10.1007/978-1-4020-8588-8

A. Darte, R. Schreiber, B. R. Rau, and F. Vivien, Constructing and exploiting linear schedules with prescribed parallelism, ACM Transactions on Design Automation of Electronic Systems, vol.7, issue.1, pp.159-172, 2002.
DOI : 10.1145/504914.504921

URL : https://hal.archives-ouvertes.fr/hal-00807410

F. Eddy-de-greef, H. Catthoor, and . Man, Memory size reduction through storage order optimization for embedded parallel multimedia applications, Parallel Computing, vol.23, issue.12, pp.1811-1837, 1997.
DOI : 10.1016/S0167-8191(97)00089-6

H. Devos, J. Van-campenhout, and D. Stroobandt, Building an application-specific memory hierarchy on FPGA, Proceedings of the 2nd HiPEAC Workshop on Reconfigurable Computing, pp.53-62, 2008.

A. Fraboulet and T. Risset, Master Interface for On-chip Hardware Accelerator Burst Communications, The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol.7, issue.2, pp.73-85, 2007.
DOI : 10.1007/s11265-006-0045-2

URL : https://hal.archives-ouvertes.fr/hal-00391222

. Gaut, High-level synthesis tool from C to RTL

V. Lefebvre and P. Feautrier, Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998.
DOI : 10.1016/S0167-8191(98)00029-5

. Nisc, No instruction-set computer (C-to-RTL)

N. D. Preeti-ranjan-panda, A. Dutt, and . Nicolau, Exploiting off-chip memory access modes in high-level synthesis, IEEE/ACM International Conference on Computer-Aided Design (ICCAD'97), pp.333-340, 1997.

J. Park and P. C. Diniz, Synthesis of pipelined memory access controllers for streamed data applications on FPGA-based computing engines, Proceedings of the 14th international symposium on Systems synthesis , ISSS '01, pp.221-226, 2001.
DOI : 10.1145/500001.500054

A. Plesco and T. Risset, Coupling loop transformations and high-level synthesis, SYMPosium en Architectures nouvelles de machines (SYMPA'08), 2008.
URL : https://hal.archives-ouvertes.fr/hal-00410724

F. Quilleré, S. Rajopadhye, and D. Wilde, Generation of efficient nested loops from polyhedra, International Journal of Parallel Programming, vol.28, issue.5, pp.469-498, 2000.
DOI : 10.1023/A:1007554627716

B. R. Rau, Iterative Modulo Scheduling, International Journal of Parallel Programming, vol.3, issue.3, pp.3-64, 1996.
DOI : 10.1007/BF03356742

URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.304.8660

G. Stitt, G. Chaudhari, and J. Coole, Traversal caches, Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis, CODES/ISSS '08, pp.61-66, 2008.
DOI : 10.1145/1450135.1450150

J. Xue, Loop Tiling for Parallelism, 2000.
DOI : 10.1007/978-1-4615-4337-4