. Average,

, REFERENCES

M. L. Anido, A. Paar, and N. Bagherzadeh, Improving the operation autonomy of SIMD processing elements by using guarded instructions and pseudo branches, Proceedings Euromicro Symposium on Digital System Design. Architectures, Methods and Tools, pp.148-155, 2002.
DOI : 10.1109/DSD.2002.1115363

F. Bonomi, R. Milito, J. Zhu, and S. Addepalli, Fog computing and its role in the internet of things, Proceedings of the first edition of the MCC workshop on Mobile cloud computing, MCC '12, pp.13-16, 2012.
DOI : 10.1145/2342509.2342513

F. Bouwens, M. Berekovic, A. Kanstein, and G. Gaydadjiev, Architectural Exploration of the ADRES Coarse-Grained Reconfigurable Array, Proceedings of the 3rd International Conference on Reconfigurable Computing: Architectures, Tools and Applications, p.7, 2007.
DOI : 10.1007/978-3-540-71431-6_1

F. Campi, R. König, M. Dreschmann, M. Neukirchner, D. Picard et al., RTL-to-layout implementation of an embedded coarse grained architecture for dynamically reconfigurable computing in systems-on-chip, 2009 International Symposium on System-on-Chip, pp.110-113, 2009.
DOI : 10.1109/SOCC.2009.5335665

URL : https://hal.archives-ouvertes.fr/hal-00518128

K. Chang and K. Choi, Mapping control intensive kernels onto coarsegrained reconfigurable array architecture, International SoC Design Conference, pp.362-365, 2008.

L. Chen and T. Mitra, Graph minor approach for application mapping on cgras, ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol.7, issue.3, p.21, 2014.

J. Cong, M. A. Ghodrat, M. Gill, B. Grigorian, and G. Reinman, CHARM, Proceedings of the 2012 ACM/IEEE international symposium on Low power electronics and design, ISLPED '12, pp.379-384, 2012.
DOI : 10.1145/2333660.2333747

S. Das, K. J. Martin, P. Coussy, D. Rossi, and L. Benini, Efficient mapping of CDFG onto coarse-grained reconfigurable array architectures, 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), pp.127-132, 2017.
DOI : 10.1109/ASPDAC.2017.7858308

URL : https://hal.archives-ouvertes.fr/hal-01452277

S. Das, T. Peyret, K. Martin, G. Corre, M. Thevenin et al., A Scalable Design Approach to Efficiently Map Applications on CGRAs, 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp.655-660, 2016.
DOI : 10.1109/ISVLSI.2016.54

URL : https://hal.archives-ouvertes.fr/hal-01347764

S. Das, D. Rossi, K. Martin, P. Coussy, and L. Benini, A 142 mops/mw integrated programmable array accelerator for smart visual processing, 2017 IEEE International Symposium of Circuits and Systems (ISCAS), p.page Accepted, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01534574

B. D. Sutter, P. Raghavan, and A. Lambrechts, Coarse-Grained Reconfigurable Array Architectures, pp.449-484, 2010.

M. Dehyadegari, A. Marongiu, M. R. Kakoee, S. Mohammadi, N. Yazdani et al., Architecture Support for Tightly-Coupled Multi-Core Clusters with Shared-Memory HW Accelerators, IEEE Transactions on Computers, vol.64, issue.8, pp.2132-2144, 2015.
DOI : 10.1109/TC.2014.2360522

G. Donohoe, Reconfigurable data path processor, US Patent, vol.6883, p.84, 2005.

G. W. Donohoe, D. M. Buehler, K. J. Hass, W. Walker, and P. Yeh, Field Programmable Processor Array: Reconfigurable Computing for Space, 2007 IEEE Aerospace Conference, pp.1-6, 2007.
DOI : 10.1109/AERO.2007.353105

L. Duch, S. Basu, R. Braojos, G. Ansaloni, L. Pozzi et al., HEAL-WEAR: An Ultra-Low Power Heterogeneous System for Bio-Signal Analysis, IEEE Transactions on Circuits and Systems I: Regular Papers, vol.64, issue.9, pp.2448-2461, 2017.
DOI : 10.1109/TCSI.2017.2701499

M. Gautschi, P. D. Schiavone, A. Traber, I. Loi, A. Pullini et al., Near-Threshold RISC-V Core With DSP Extensions for Scalable IoT Endpoint Devices, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, 2017.
DOI : 10.1109/TVLSI.2017.2654506

M. Gautschi, A. Traber, A. Pullini, L. Benini, M. Scandale et al., Tailoring instruction-set extensions for an ultra-low power tightly-coupled cluster of OpenRISC cores, 2015 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), pp.25-30, 2015.
DOI : 10.1109/VLSI-SoC.2015.7314386

S. C. Goldstein, H. Schmit, M. Budiu, S. Cadambi, M. Moe et al., PipeRench: a reconfigurable architecture and compiler, Computer, vol.33, issue.4, pp.70-77, 2000.
DOI : 10.1109/2.839324

M. Hamzeh, A. Shrivastava, and S. Vrudhula, REGIMap, Proceedings of the 50th Annual Design Automation Conference on, DAC '13, p.18, 2013.
DOI : 10.1145/2463209.2488756

K. Han, J. K. Paek, and K. Choi, Acceleration of control flow on CGRA using advanced predicated execution, 2010 International Conference on Field-Programmable Technology, 2010.
DOI : 10.1109/FPT.2010.5681452

K. Han, S. Park, and K. Choi, State-based full predication for low power coarse-grained reconfigurable architecture, 2012 Design, Automation Test in Europe Conference Exhibition (DATE), 2012.

T. Instruments, Tms320c64x/c64x+ dsp cpu and instruction set reference guide. Texas Instruments, 2005.

C. Kim, M. Chung, Y. Cho, M. Konijnenburg, S. Ryu et al., Ulpsrp: Ultra low-power samsung reconfigurable processor for biomedical applications, ACM Transactions on Reconfigurable Technology and Systems (TRETS), vol.7, issue.3, p.22, 2014.
DOI : 10.1109/fpt.2012.6412157

Y. Kim, M. Kiemb, C. Park, J. Jung, and K. Choi, Resource sharing and pipelining in coarse-grained reconfigurable architecture for domainspecific optimization, Design, Automation and Test in Europe, pp.12-17, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00181488

D. Kissler, A. Strawetz, F. Hannig, and J. Teich, Power-Efficient Reconfiguration Control in Coarse-Grained Dynamically Reconfigurable Architectures, Journal of Low Power Electronics, vol.5, issue.1, pp.96-105, 2009.
DOI : 10.1166/jolpe.2009.1008

URL : http://www12.informatik.uni-erlangen.de/publications/pub2008/ksht08.pdf

D. Lampret, C. Chen, M. Mlinar, J. Rydberg, M. Ziv-av et al., Openrisc 1000 architecture manual. Opencores, 2003.

J. Lee, S. Seo, H. Lee, and H. U. Sim, Flattening-based mapping of imperfect loop nests for CGRAs, Proceedings of the 2014 International Conference on Hardware/Software Codesign and System Synthesis, CODES '14, pp.1-10, 2014.
DOI : 10.1145/2656075.2656085

G. Levi, A note on the derivation of maximal common subgraphs of two directed or undirected graphs, Calcolo, vol.3, issue.4, pp.341-352, 1973.
DOI : 10.1007/BF02575586

C. Liang and X. Huang, SmartCell: An Energy Efficient Coarse-Grained Reconfigurable Architecture for Stream-Based Applications, EURASIP Journal on Embedded Systems, vol.13, issue.1, p.518659, 2009.
DOI : 10.1016/j.micpro.2006.02.009

C. Liu, H. Ng, and H. K. So, Automatic nested loop acceleration on fpgas using soft CGRA overlay, Proceedings of the Second International Workshop on FPGAs for Software Programmers (FSP), pp.13-18, 2015.

D. Liu, S. Yin, L. Liu, and S. Wei, Polyhedral model based mapping optimization of loop nests for CGRAs, Proceedings of the 50th Annual Design Automation Conference on, DAC '13, pp.1-8, 2013.
DOI : 10.1145/2463209.2488757

L. Liu, J. Wang, J. Zhu, C. Deng, S. Yin et al., TLIA: Efficient Reconfigurable Architecture for Control-Intensive Kernels with Triggered-Long-Instructions, IEEE Transactions on Parallel and Distributed Systems, vol.27, issue.7, pp.2143-2154, 2016.
DOI : 10.1109/TPDS.2015.2477841

J. Lopes, D. Sousa, and J. C. Ferreira, Evaluation of CGRA architecture for real-time processing of biological signals on wearable devices, 2017 International Conference on ReConFigurable Computing and FPGAs (ReConFig), pp.1-7, 2017.
DOI : 10.1109/RECONFIG.2017.8279789

K. T. Madhu, S. Das, N. Sivanandan, S. K. Nandy, and R. Narayan, Compiling HPC Kernels for the REDEFINE CGRA, 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conference on Embedded Software and Systems, pp.405-410, 2015.
DOI : 10.1109/HPCC-CSS-ICESS.2015.139

A. Marshall, T. Stansfield, I. Kostarnov, J. Vuillemin, and B. Hutchings, A reconfigurable arithmetic array for multimedia applications, Proceedings of the 1999 ACM/SIGDA seventh international symposium on Field programmable gate arrays , FPGA '99, pp.135-143, 1999.
DOI : 10.1145/296399.296444

K. Masuyama, Y. Fujita, H. Okuhara, and H. Amano, A 297mops/0.4 mw ultra low power coarse-grained reconfigurable accelerator CMA- SOTB-2, 2015 International Conference on ReConFigurable Computing and FPGAs (ReConFig), pp.1-6, 2015.

B. Mei, S. Vernalde, D. Verkest, H. De-man, and R. Lauwereins, Dresc: A retargetable compiler for coarse-grained reconfigurable architectures, Field-Programmable TechnologyFPT). Proceedings. 2002 IEEE International Conference on, pp.166-173, 2002.

E. Mirsky and A. Dehon, MATRIX: a reconfigurable computing architecture with configurable instruction distribution and deployable resources, Proceedings IEEE Symposium on FPGAs for Custom Computing Machines FPGA-96, pp.17-19, 1996.
DOI : 10.1109/FPGA.1996.564808

N. Ozaki, Y. Yoshihiro, Y. Saito, D. Ikebuchi, M. Kimura et al., Cool megaarray: A highly energy efficient reconfigurable accelerator, Field- Programmable Technology (FPT), 2011 International Conference on, pp.1-8, 2011.
DOI : 10.1109/fpt.2011.6132668

H. Park, K. Fan, S. A. Mahlke, T. Oh, H. Kim et al., Edgecentric modulo scheduling for coarse-grained reconfigurable architectures, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, pp.166-176, 2008.
DOI : 10.1145/1454115.1454140

URL : http://cccp.eecs.umich.edu/papers/parkhc-pact08.pdf

K. Patel, S. Mcgettrick, and C. J. Bleakley, SYSCORE: A Coarse Grained Reconfigurable Array Architecture for Low Energy Biosignal Processing, 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, pp.109-112, 2011.
DOI : 10.1109/FCCM.2011.38

URL : http://researchrepository.ucd.ie/bitstream/10197/7033/1/SYSCORE_A_Coarse_Grained_Reconfigurable_Array_Architecture_for_Low_Energy_Biosignal_Processing.pdf

P. G. Paulin and J. P. Knight, Force-directed scheduling for the behavioral synthesis of ASICs, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.8, issue.6, pp.661-679, 1989.
DOI : 10.1109/43.31522

T. Peyret, G. Corre, M. Thevenin, K. Martin, and P. Coussy, Efficient application mapping on CGRAs based on backward simultaneous scheduling/binding and dynamic graph transformations, 2014 IEEE 25th International Conference on Application-Specific Systems, Architectures and Processors, pp.169-172, 2014.
DOI : 10.1109/ASAP.2014.6868652

URL : https://hal.archives-ouvertes.fr/hal-01009486

A. Rahimi, I. Loi, M. R. Kakoee, and L. Benini, A fully-synthesizable single-cycle interconnection network for Shared-L1 processor clusters, 2011 Design, Automation & Test in Europe, pp.1-6, 2011.
DOI : 10.1109/DATE.2011.5763085

Z. E. Rakossy, A. Acosta-aponte, T. G. Noll, G. Ascheid, R. Leupers et al., Design and synthesis of reconfigurable controlflow structures for cgra, 2015 International Conference on ReCon- Figurable Computing and FPGAs (ReConFig), pp.1-8, 2015.

Z. E. Rákossy, D. Stengele, G. Ascheid, R. Leupers, and A. Chattopadhyay, Exploiting scalable CGRA mapping of LU for energy efficiency using the Layers architecture, 2015 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), pp.337-342, 2015.
DOI : 10.1109/VLSI-SoC.2015.7314440

D. Rossi, A. Pullini, I. Loi, M. Gautschi, F. K. Gürkaynak et al., A 60 GOPS/W, -1.8 V to 0.9 V body bias ULP cluster in 28 nm UTBB fd-soi technology. Solid-State Electronics, pp.170-184, 2016.
DOI : 10.1016/j.sse.2015.11.015

Y. Saito, T. Sano, M. Kato, V. Tunbunheng, Y. Yasuda et al., Muccra-3: a low power dynamically reconfigurable processor array, Proceedings of the 2010 Asia and South Pacific Design Automation Conference, pp.377-378, 2010.

H. Singh, M. Lee, G. Lu, F. J. Kurdahi, N. Bagherzadeh et al., MorphoSys, Proceedings of the 37th conference on Design automation , DAC '00, pp.465-481, 2000.
DOI : 10.1145/337292.337583

Y. Song and Y. Lin, Unroll-and-jam for imperfectly-nested loops in DSP applications, Proceedings of the international conference on Compilers, architectures, and synthesis for embedded systems , CASES '00, pp.148-156, 2000.
DOI : 10.1145/354880.354901

H. Su, Y. Fujita, and H. Amano, Body bias control for a coarse grained reconfigurable accelerator implemented with Silicon on Thin BOX technology, 2014 24th International Conference on Field Programmable Logic and Applications (FPL), pp.1-6, 2014.
DOI : 10.1109/FPL.2014.6927486

S. Yin, P. Zhou, L. Liu, and S. Wei, Acceleration of nested conditionals on CGRAs via trigger scheme, 2015 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 2015.
DOI : 10.1109/ICCAD.2015.7372624

Z. Yu, M. J. Meeuwsen, R. W. Apperson, O. Sattari, M. Lai et al., AsAP: An Asynchronous Array of Simple Processors, IEEE Journal of Solid-State Circuits, vol.43, issue.3, pp.43695-705, 2008.
DOI : 10.1109/JSSC.2007.916616