Bee+Cl@k: An implementation of lattice-based array contraction in the source-to-source translator Rose, ACM SIGPLAN/SIGBED Conference on Languages, Compilers, and Tools for Embedded Systems (LCTES'07), pp.73-82, 2007.
A practical automatic polyhedral parallelizer and locality optimizer, ACM International Conference on Programming Languages Design and Implementation (PLDI'08), pp.101-113, 2008.
Scanning polyhedra without Do-loops, Proceedings of the 1998 International Conference on Parallel Architectures and Compilation Techniques, 1998.
DOI : 10.1109/PACT.1998.727127
URL : https://hal.archives-ouvertes.fr/inria-00564990
Effective communication coalescing for data-parallel applications, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pp.14-25, 2005.
DOI : 10.1145/1065944.1065948
High-Level Synthesis: From Algorithm to Digital Circuit, 2008.
DOI : 10.1007/978-1-4020-8588-8
Constructing and exploiting linear schedules with prescribed parallelism, ACM Transactions on Design Automation of Electronic Systems, vol.7, issue.1, pp.159-172, 2002.
DOI : 10.1145/504914.504921
URL : https://hal.archives-ouvertes.fr/hal-00807410
Memory size reduction through storage order optimization for embedded parallel multimedia applications, Parallel Computing, vol.23, issue.12, pp.1811-1837, 1997.
DOI : 10.1016/S0167-8191(97)00089-6
Building an application-specific memory hierarchy on FPGA, Proceedings of the 2nd HiPEAC Workshop on Reconfigurable Computing, pp.53-62, 2008.
Master Interface for On-chip Hardware Accelerator Burst Communications, The Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol.7, issue.2, pp.73-85, 2007.
DOI : 10.1007/s11265-006-0045-2
URL : https://hal.archives-ouvertes.fr/hal-00391222
High-level synthesis tool from C to RTL.
Automatic storage management for parallel programs, Parallel Computing, vol.24, issue.3-4, pp.649-671, 1998.
DOI : 10.1016/S0167-8191(98)00029-5
No instruction-set computer (C-to-RTL).
Exploiting off-chip memory access modes in high-level synthesis, IEEE/ACM International Conference on Computer-Aided Design (ICCAD'97), pp.333-340, 1997.
Synthesis of pipelined memory access controllers for streamed data applications on FPGA-based computing engines, Proceedings of the 14th international symposium on Systems synthesis, ISSS '01, pp.221-226, 2001.
DOI : 10.1145/500001.500054
Coupling loop transformations and high-level synthesis, SYMPosium en Architectures nouvelles de machines (SYMPA'08), 2008.
URL : https://hal.archives-ouvertes.fr/hal-00410724
Generation of efficient nested loops from polyhedra, International Journal of Parallel Programming, vol.28, issue.5, pp.469-498, 2000.
DOI : 10.1023/A:1007554627716
Iterative Modulo Scheduling, International Journal of Parallel Programming, vol.3, issue.3, pp.3-64, 1996.
DOI : 10.1007/BF03356742
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.304.8660
Traversal caches, Proceedings of the 6th IEEE/ACM/IFIP international conference on Hardware/Software codesign and system synthesis, CODES/ISSS '08, pp.61-66, 2008.
DOI : 10.1145/1450135.1450150
Loop Tiling for Parallelism, 2000.
DOI : 10.1007/978-1-4615-4337-4