Y. LeCun, Y. Bengio, and G. Hinton, Deep learning, Nature, vol.521, issue.7553, pp.436-444, 2015.
DOI : 10.1038/nature14539

O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh et al., ImageNet Large Scale Visual Recognition Challenge, International Journal of Computer Vision, vol.115, issue.3, pp.211-252, 2015.
DOI : 10.1007/s11263-015-0816-y

URL : http://dspace.mit.edu/bitstream/1721.1/104944/1/11263_2015_Article_816.pdf

J. Long, E. Shelhamer, and T. Darrell, Fully convolutional networks for semantic segmentation, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.3431-3440, 2015.
DOI : 10.1109/CVPR.2015.7298965

URL : http://arxiv.org/pdf/1411.4038

Y. Zhang, M. Pezeshki, P. Brakel, S. Zhang, C. Laurent et al., Towards end-to-end speech recognition with deep convolutional neural networks, arXiv preprint, 2016.
DOI : 10.21437/interspeech.2016-1446

URL : http://arxiv.org/pdf/1701.02720

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition. arXiv preprint, pp.1-14, 2014.

E. Nurvitadhi, S. Subhaschandra, G. Boudoukh, G. Venkatesh, J. Sim et al., Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '17, pp.5-14, 2017.

J. Qiu, J. Wang, S. Yao, K. Guo, B. Li et al., Going Deeper with Embedded FPGA Platform for Convolutional Neural Network, Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '16, pp.26-35, 2016.

Intel, Intel® Stratix® 10 Variable Precision DSP Blocks User Guide, 2017.

G. Lacey, G. W. Taylor, and S. Areibi, Deep Learning on FPGAs: Past, Present, and Future. arXiv e-print, 2016.

V. Sze, Y. Chen, T. Yang, and J. Emer, Efficient Processing of Deep Neural Networks: A Tutorial and Survey, Proceedings of the IEEE, vol.105, issue.12, pp.2295-2329, 2017.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, vol.86, issue.11, pp.2278-2324, 1998.
DOI : 10.1109/5.726791

URL : http://www.cs.berkeley.edu/~daf/appsem/Handwriting/papers/00726791.pdf

A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet Classification with Deep Convolutional Neural Networks, Advances in Neural Information Processing Systems - NIPS'12, pp.1-9, 2012.
DOI : 10.1145/3065386

URL : http://dl.acm.org/ft_gateway.cfm?id=3065386&type=pdf

D. H. Hubel and T. N. Wiesel, Receptive fields, binocular interaction and functional architecture in the cat's visual cortex, The Journal of Physiology, vol.160, issue.1, pp.106-154, 1962.

S. Ioffe and C. Szegedy, Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, Proceedings of the International Conference on Machine Learning - ICML '15, pp.448-456, 2015.

M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, and Y. Bengio, Binarized Neural Networks: Training Deep Neural Networks with Weights and Activations Constrained to +1 or -1. arXiv e-print, 2016.

C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed et al., Going deeper with convolutions, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015.
DOI : 10.1109/CVPR.2015.7298594

URL : http://arxiv.org/pdf/1409.4842

K. He, X. Zhang, S. Ren, and J. Sun, Deep Residual Learning for Image Recognition, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.770-778, 2016.
DOI : 10.1109/CVPR.2016.90

J. Cong and B. Xiao, Minimizing Computation in Convolutional Neural Networks, Proceedings of the International Conference on Artificial Neural Networks - ICANN '14, pp.281-290, 2014.
DOI : 10.1007/978-3-319-11179-7_36

R. G. Shoup, Parameterized convolution filtering in a field-programmable gate array, Proceedings of the International Workshop on Field Programmable Logic and Applications on More FPGAs, pp.274-280, 1994.

M. Horowitz, 1.1 Computing's energy problem (and what we can do about it), 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC), pp.10-14, 2014.
DOI : 10.1109/ISSCC.2014.6757323


Nvidia, GPU-Based Deep Learning Inference: A Performance and Power Analysis, 2015.

S. Chetlur, C. Woolley, P. Vandermersch, J. Cohen, J. Tran et al., cuDNN: Efficient Primitives for Deep Learning, 2014.

H. Perkins, DeepCL: OpenCL library to train deep convolutional neural networks, 2017.

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev, J. Long et al., Caffe: Convolutional Architecture for Fast Feature Embedding, Proceedings of the ACM International Conference on Multimedia, 2014.

M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis et al., TensorFlow: A System for Large-Scale Machine Learning, Proceedings of the USENIX Symposium on Operating Systems Design and Implementation -OSDI '16, pp.265-284, 2016.

N. Suda, V. Chandra, G. Dasika, A. Mohanty, Y. Ma et al., Throughput-Optimized OpenCL-based FPGA Accelerator for Large-Scale Convolutional Neural Networks, Proceedings of the 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '16, pp.16-25, 2016.

U. Aydonat, S. O'Connell, D. Capalija, A. C. Ling, and G. R. Chiu, An OpenCL™ Deep Learning Accelerator on Arria 10, Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '17, pp.55-64, 2017.

R. Dicecco, G. Lacey, J. Vasiljevic, P. Chow, G. Taylor et al., Caffeinated FPGAs: FPGA framework For Convolutional Neural Networks, 2016 International Conference on Field-Programmable Technology (FPT), 2016.
DOI : 10.1109/FPT.2016.7929549

C. Zhang and V. Prasanna, Frequency Domain Acceleration of Convolutional Neural Networks on CPU-FPGA Shared Memory System, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '17, pp.35-44, 2017.

J.-H. Ko, B. Mudassar, T. Na, and S. Mukhopadhyay, Design of an Energy-Efficient Accelerator for Training of Convolutional Neural Networks using Frequency-Domain Computation, Proceedings of the Annual Conference on Design Automation - DAC '17, 2017.

S. I. Venieris and C. S. Bouganis, fpgaConvNet: A Framework for Mapping Convolutional Neural Networks on FPGAs, Proceedings of the IEEE Annual International Symposium on Field-Programmable Custom Computing Machines - FCCM '16, pp.40-47, 2016.

H. Sharma, J. Park, D. Mahajan, E. Amaro, J. K. Kim et al., From high-level deep neural models to FPGAs, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), pp.1-12, 2016.
DOI : 10.1109/MICRO.2016.7783720

H. Li, X. Fan, L. Jiao, W. Cao, X. Zhou et al., A high performance FPGA-based accelerator for large-scale convolutional neural networks, Proceedings of the International Conference on Field Programmable Logic and Applications - FPL '16, pp.1-9, 2016.

G. Natale, M. Bacis, and M. D. Santambrogio, On How to Design Dataflow FPGA-Based Accelerators for Convolutional Neural Networks, Proceedings of the IEEE Computer Society Annual Symposium on VLSI - ISVLSI '17, pp.639-644, 2017.

K. Abdelouahab, M. Pelcat, J. Serot, C. Bourrasset, and F. Berry, Tactics to Directly Map CNN Graphs on Embedded FPGAs, IEEE Embedded Systems Letters, vol.9, issue.4, pp.1-4, 2017.
DOI : 10.1109/LES.2017.2743247

URL : https://hal.archives-ouvertes.fr/hal-01626462

C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao et al., Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '15, pp.161-170, 2015.
DOI : 10.1145/2684746.2689060

M. Motamedi, P. Gysel, V. Akella, and S. Ghiasi, Design space exploration of FPGA-based Deep Convolutional Neural Networks, 2016 21st Asia and South Pacific Design Automation Conference (ASP-DAC), pp.575-580
DOI : 10.1109/ASPDAC.2016.7428073

P. Meloni, G. Deriu, F. Conti, I. Loi, L. Raffo et al., Curbing the Roofline: a Scalable and Flexible Architecture for CNNs on FPGA, Proceedings of the ACM International Conference on Computing Frontiers - CF '16, pp.376-383, 2016.

M. Motamedi, P. Gysel, and S. Ghiasi, PLACID: A Platform for FPGA-Based Accelerator Creation for DCNNs, ACM Transactions on Multimedia Computing, Communications, and Applications, 2017.

X. Wei, C. H. Yu, P. Zhang, Y. Chen, Y. Wang et al., Automated Systolic Array Architecture Synthesis for High Throughput CNN Inference on FPGAs, Proceedings of the 54th Annual Design Automation Conference - DAC '17, pp.1-6, 2017.

T. Fujii, S. Sato, H. Nakahara, and M. Motomura, An FPGA Realization of a Deep Convolutional Neural Network Using a Threshold Neuron Pruning, Proceedings of the International Symposium on Applied Reconfigurable Computing - ARC '17, pp.268-280, 2017.

S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan, Deep Learning with Limited Numerical Precision, Proceedings of the International Conference on Machine Learning - ICML '15, pp.1737-1746, 2015.

S. Zhou, Y. Wu, Z. Ni, X. Zhou, H. Wen et al., DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients, 2016.

M. Courbariaux, Y. Bengio, and J. David, Training deep neural networks with low precision multiplications. arXiv e-print, 2014.

P. Gysel, M. Motamedi, and S. Ghiasi, Hardware-oriented Approximation of Convolutional Neural Networks, arXiv preprint, 2016.

M. Courbariaux, Y. Bengio, and J. David, BinaryConnect: Training Deep Neural Networks with binary weights during propagations, Advances in Neural Information Processing Systems - NIPS'15, pp.3123-3131, 2015.

M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks, Proceedings of the European Conference on Computer Vision -ECCV'16, pp.525-542, 2016.

URL : http://arxiv.org/pdf/1603.05279

Y. Umuroglu, N. J. Fraser, G. Gambardella, M. Blott, P. Leong et al., FINN: A Framework for Fast, Scalable Binarized Neural Network Inference, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '17, pp.65-74, 2017.

URL : http://arxiv.org/pdf/1612.07119

R. Andri, L. Cavigelli, D. Rossi, and L. Benini, YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights, 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), pp.236-241, 2016.
DOI : 10.1109/ISVLSI.2016.111

R. Zhao, W. Ouyang, H. Li, and X. Wang, Saliency detection by multi-context deep learning, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.1265-1274, 2015.
DOI : 10.1109/CVPR.2015.7298731

H. Sim and J. Lee, A New Stochastic Computing Multiplier with Application to Deep Convolutional Neural Networks, Proceedings of the 54th Annual Design Automation Conference - DAC '17, pp.1-6, 2017.

V. T. Lee, A. Alaghi, J. P. Hayes, V. Sathe, and L. Ceze, Energy-Efficient Hybrid Stochastic-Binary Neural Networks for Near-Sensor Computing, Proceedings of the Conference on Design, Automation and Test in Europe - DATE '17, 2017.

K. Kim, J. Kim, J. Yu, J. Seo, J. Lee et al., Dynamic Energy-accuracy Trade-off Using Stochastic Computing in Deep Neural Networks, Proceedings of the Annual Conference on Design Automation - DAC '16, 2016.

J. Zhang and J. Li, Improving the Performance of OpenCL-based FPGA Accelerator for Convolutional Neural Network, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '17, pp.25-34, 2017.

R. Tapiador, A. Rios-Navarro, A. Linares-Barranco, M. Kim, D. Kadetotad et al., Comprehensive Evaluation of OpenCL-based Convolutional Neural Network Accelerators in Xilinx and Altera FPGAs, Proceedings of the International Work-Conference on Artificial Neural Networks - IWANN '17, pp.271-282, 2017.

J. Bottleson, S. Kim, J. Andrews, P. Bindu, D. N. Murthy et al., clCaffe: OpenCL accelerated Caffe for convolutional neural networks, Proceedings of the IEEE International Parallel and Distributed Processing Symposium - IPDPS '16, pp.50-57, 2016.
DOI : 10.1109/ipdpsw.2016.182

Intel, The Intel® FPGA SDK for Open Computing Language (OpenCL), 2016.

C. Zhang, D. Wu, J. Sun, G. Sun, G. Luo et al., Energy-Efficient CNN Implementation on a Deeply Pipelined FPGA Cluster, Proceedings of the International Symposium on Low Power Electronics and Design - ISLPED '16, pp.326-331, 2016.

C. Zhang, Z. Fang, P. Zhou, P. Pan, and J. Cong, Caffeine: Towards uniformed representation and acceleration for deep convolutional neural networks, Proceedings of the International Conference on Computer-Aided Design - ICCAD '16, pp.1-8, 2016.

E. H. Bareiss, Numerical solution of linear equations with Toeplitz and vector Toeplitz matrices, Numerische Mathematik, vol.13, pp.404-424, 1969.

S. Winograd, Arithmetic complexity of computations, Siam, vol.33, 1980.
DOI : 10.1137/1.9781611970364

A. Lavin and S. Gray, Fast Algorithms for Convolutional Neural Networks. arXiv e-print, 2015.
DOI : 10.1109/cvpr.2016.435

URL : http://arxiv.org/pdf/1509.09308

L. Lu, Y. Liang, Q. Xiao, and S. Yan, Evaluating Fast Algorithms for Convolutional Neural Networks on FPGAs, 2017 IEEE 25th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), pp.101-108, 2017.
DOI : 10.1109/FCCM.2017.64

S. Smith, The scientist and engineer's guide to digital signal processing. California Technical Pub, 1997.

M. Sankaradas, V. Jakkula, S. Cadambi, S. Chakradhar, I. Durdanovic et al., A Massively Parallel Coprocessor for Convolutional Neural Networks, 2009 20th IEEE International Conference on Application-specific Systems, Architectures and Processors, pp.53-60, 2009.
DOI : 10.1109/ASAP.2009.25

C. Farabet, C. Poulet, J. Y. Han, and Y. LeCun, CNP: An FPGA-based processor for Convolutional Networks, 2009 International Conference on Field Programmable Logic and Applications, 2009.
DOI : 10.1109/FPL.2009.5272559

S. Chakradhar, M. Sankaradas, V. Jakkula, and S. Cadambi, A dynamically configurable coprocessor for convolutional neural networks, ACM SIGARCH Computer Architecture News, vol.38, issue.3, pp.247-257, 2010.
DOI : 10.1145/1816038.1815993

C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello et al., NeuFlow: A runtime reconfigurable dataflow processor for vision, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops - CVPRW '11, pp.109-116, 2011.

V. Gokhale, J. Jin, A. Dundar, B. Martini, and E. Culurciello, A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp.696-701, 2014.
DOI : 10.1109/CVPRW.2014.106

A. Rahman, J. Lee, and K. Choi, Efficient FPGA acceleration of Convolutional Neural Networks using logical-3D compute array, Proceedings of the Conference on Design, Automation and Test in Europe - DATE '16, 2016.

M. Alwani, H. Chen, M. Ferdman, and P. Milder, Fused-layer CNN accelerators, 2016 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2016.
DOI : 10.1109/MICRO.2016.7783725

Y. Ma, Y. Cao, S. Vrudhula, and J. Seo, Optimizing Loop Operation and Dataflow in FPGA Acceleration of Deep Convolutional Neural Networks, Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '17, pp.45-54, 2017.

S. Derrien and S. Rajopadhye, Loop tiling for reconfigurable accelerators, Proceedings of the International Conference on Field Programmable Logic and Applications - FPL '01, pp.398-408, 2001.
DOI : 10.1007/3-540-44687-7_41

S. Williams, A. Waterman, and D. Patterson, Roofline: An Insightful Visual Performance Model for Multicore Architectures, Communications of the ACM, vol.52, issue.4, pp.65-76, 2009.
DOI : 10.1145/1498765.1498785

Y. Ma, M. Kim, Y. Cao, S. Vrudhula, and J. Seo, End-to-end scalable FPGA accelerator for deep residual networks, 2017 IEEE International Symposium on Circuits and Systems (ISCAS), pp.1-4, 2017.
DOI : 10.1109/ISCAS.2017.8050344

Y. Ma, Y. Cao, S. Vrudhula, and J. Seo, An automatic RTL compiler for high-throughput FPGA implementation of diverse deep convolutional neural networks, 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp.1-8, 2017.
DOI : 10.23919/FPL.2017.8056824

Z. Liu, Y. Dou, J. Jiang, J. Xu, S. Li et al., Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks, ACM Transactions on Reconfigurable Technology and Systems, vol.10, issue.3, pp.1-23, 2017.

J. B. Dennis and D. P. Misunas, A Preliminary Architecture for a Basic Data-Flow Processor, Proceedings of the International Symposium on Computer Architecture - ISCA '75, pp.126-132, 1975.

L. Lin, T. Fanni, T. Viitanen, R. Xie, F. Palumbo et al., Low power design methodology for signal processing systems using lightweight dataflow techniques, Proceedings of the Conference on Design and Architectures for Signal and Image Processing - DASIP '16, pp.82-89, 2016.

C. Shen, W. Plishker, H. Wu, and S. S. Bhattacharyya, A lightweight dataflow approach for design and implementation of SDR systems, Proceedings of the Wireless Innovation Conference and Product Exposition, pp.640-645, 2010.

E. A. Lee and D. G. Messerschmitt, Synchronous data flow, Proceedings of the IEEE, vol.75, issue.9, pp.1235-1245, 1987.

S. I. Venieris and C. S. Bouganis, Latency-Driven Design for FPGA-based Convolutional Neural Networks, Proceedings of the International Conference on Field Programmable Logic and Applications - FPL '17, 2017.

S. Mittal, A Survey of Techniques for Approximate Computing, ACM Computing Surveys, vol.48, issue.4, pp.1-33, 2016.
DOI : 10.1145/2893356

S. Anwar, K. Hwang, and W. Sung, Fixed point optimization of deep convolutional neural networks for object recognition, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
DOI : 10.1109/ICASSP.2015.7178146

D. Lin, S. Talathi, and V. Annapureddy, Fixed Point Quantization of Deep Convolutional Networks, Proceedings of the International Conference on Machine Learning -ICML '16, pp.2849-2858, 2016.

I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio, Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations. arXiv e-print, 2016.

S. Zhou, Y. Wang, H. Wen, Q. He, and Y. Zou, Balanced Quantization: An Effective and Efficient Approach to Quantized Neural Networks, Journal of Computer Science and Technology, vol.32, issue.4, pp.667-682, 2017.

J. Wu, C. Leng, Y. Wang, Q. Hu, and J. Cheng, Quantized Convolutional Neural Networks for Mobile Devices, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp.4820-4828, 2016.
DOI : 10.1109/CVPR.2016.521

J. David, K. Kalach, and N. Tittley, Hardware Complexity of Modular Multiplication and Exponentiation, IEEE Transactions on Computers, vol.56, issue.10, pp.1308-1319, 2007.
DOI : 10.1109/TC.2007.1084

D. Williamson, Dynamically scaled fixed point arithmetic, Proceedings of the IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, pp.315-318, 1991.
DOI : 10.1109/pacrim.1991.160742

S. Guo, L. Wang, B. Chen, Q. Dou, Y. Tang et al., FixCaffe: Training CNN with Low Precision Arithmetic Operations by Fixed Point Caffe, Proceedings of the International Workshop on Advanced Parallel Processing Technologies - APPT '17, pp.38-50, 2017.
DOI : 10.1007/978-3-319-67952-5_4

H. Nakahara, T. Fujii, and S. Sato, A fully connected layer elimination for a binarized convolutional neural network on an FPGA, 2017 27th International Conference on Field Programmable Logic and Applications (FPL), pp.1-4, 2017.
DOI : 10.23919/FPL.2017.8056771

I. Hubara, M. Courbariaux, D. Soudry, R. El-Yaniv, and Y. Bengio, Binarized neural networks, Advances in Neural Information Processing Systems - NIPS'16, pp.4107-4115, 2016.

C. Zhu, S. Han, H. Mao, and W. J. Dally, Trained Ternary Quantization, Proceedings of the International Conference on Learning Representations - ICLR'17, 2017.

R. Zhao, W. Song, W. Zhang, T. Xing, J. Lin et al., Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '17, 2017.
DOI : 10.1145/2897937.2898003

A. Prost-Boucle, A. Bourge, F. Pétrot, H. Alemdar, N. Caldwell et al., Scalable High-Performance Architecture for Convolutional Ternary Neural Networks on FPGA, Proceedings of the International Conference on Field Programmable Logic and Applications - FPL '17, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01701116

A. Alaghi and J. P. Hayes, Fast and Accurate Computation using Stochastic Circuits, Proceedings of the Conference on Design, Automation and Test in Europe - DATE '14, 2014.
DOI : 10.7873/date.2014.089

B. Liu, M. Wang, H. Foroosh, M. Tappen, and M. Pensky, Sparse Convolutional Neural Networks, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition -CVPR '15, pp.806-814, 2015.

S. Han, J. Pool, J. Tran, and W. J. Dally, Learning both Weights and Connections for Efficient Neural Network, Advances in Neural Information Processing Systems - NIPS'15, pp.1135-1143, 2015.

T. Yang, Y. Chen, and V. Sze, Designing Energy-Efficient Convolutional Neural Networks using Energy-Aware Pruning, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition - CVPR '17, 2017.
DOI : 10.1109/cvpr.2017.643

URL : http://arxiv.org/pdf/1611.05128

S. Han, H. Mao, and W. J. Dally, Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding, Proceedings of the International Conference on Learning Representations - ICLR'16, pp.1-13, 2016.

A. Sironi, B. Tekin, R. Rigamonti, V. Lepetit, and P. Fua, Learning Separable Filters, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.37, issue.1, pp.94-106, 2015.
DOI : 10.1109/TPAMI.2014.2343229

R. Dorrance, F. Ren, and D. Marković, A scalable sparse matrix-vector multiplication kernel for energy-efficient Sparse-BLAS on FPGAs, Proceedings of the ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - FPGA '14, pp.161-170, 2014.