A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks, Communications of the ACM, vol.60, issue.6, 2017.
DOI : 10.1145/3065386
URL : http://dl.acm.org/ft_gateway.cfm?id=3065386&type=pdf

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE, 1998.
DOI : 10.1109/5.726791

K. Ovtcharov, O. Ruwase, J. Kim, J. Fowers, K. Strauss et al., Accelerating deep convolutional neural networks using specialized hardware, 2015.

C. Farabet, B. Martini, B. Corda, P. Akselrod, E. Culurciello et al., NeuFlow: A runtime reconfigurable dataflow processor for vision, CVPR 2011 Workshops, 2011.
DOI : 10.1109/CVPRW.2011.5981829
URL : http://yann.lecun.com/exdb/publis/pdf/farabet-ecvw-11.pdf

Altera, FPGAs Achieve Compelling Performance-per-Watt in Cloud Data Center Acceleration Using CNN Algorithms, 2015.

G. Lacey, G. W. Taylor, and S. Areibi, Deep Learning on FPGAs: Past, Present, and Future, arXiv e-prints, 2016.

J. Cloutier, E. Cosatto, and S. Pigeon, VIP: an FPGA-based processor for image processing and neural networks, Proceedings of Fifth International Conference on Microelectronics for Neural Networks, 1996.
DOI : 10.1109/MNNFS.1996.493811

S. Chakradhar, M. Sankaradas, V. Jakkula, and S. Cadambi, A dynamically configurable coprocessor for convolutional neural networks
DOI : 10.1145/1816038.1815993

M. Peemen, A. Setio, B. Mesman, and H. Corporaal, Memory-centric accelerator design for Convolutional Neural Networks, 2013 IEEE 31st International Conference on Computer Design (ICCD)
DOI : 10.1109/ICCD.2013.6657019

C. Farabet, C. Poulet, J. Y. Han, and Y. LeCun, CNP: An FPGA-based processor for Convolutional Networks, 2009 International Conference on Field Programmable Logic and Applications, 2009.
DOI : 10.1109/FPL.2009.5272559

C. Zhang, P. Li, G. Sun, Y. Guan, B. Xiao et al., Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks, Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, FPGA '15, 2015.
DOI : 10.1145/2684746.2689060

F. Bastien, P. Lamblin et al., Theano: new features and speed improvements

Y. Jia, E. Shelhamer, J. Donahue, S. Karayev et al., Caffe: Convolutional Architecture for Fast Feature Embedding, Proceedings of the ACM International Conference on Multimedia, MM '14, 2014.
DOI : 10.1145/2647868.2654889

J. B. Dennis and D. P. Misunas, A preliminary architecture for a basic data-flow processor, ISCA '75, 1975.

M. Abadi et al., TensorFlow: Large-scale machine learning on heterogeneous systems, 2015.

H. Trinh, M. Duranton, and M. Paindavoine, Efficient Data Encoding for Convolutional Neural Network application, ACM Transactions on Architecture and Code Optimization, vol.11, issue.4, 2015.
DOI : 10.1109/TSP.2008.919386

S. Anwar, K. Hwang, and W. Sung, Fixed point optimization of deep convolutional neural networks for object recognition, 2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015.
DOI : 10.1109/ICASSP.2015.7178146

S. Gupta, A. Agrawal, K. Gopalakrishnan, and P. Narayanan, Deep learning with limited numerical precision, Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015.

V. Gokhale, J. Jin, A. Dundar, B. Martini, and E. Culurciello, A 240 G-ops/s Mobile Coprocessor for Deep Neural Networks, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014.
DOI : 10.1109/CVPRW.2014.106

J. Sérot and F. Berry, High-Level Dataflow Programming for Reconfigurable Computing, 2014 International Symposium on Computer Architecture and High Performance Computing Workshop, 2014.
DOI : 10.1109/SBAC-PADW.2014.18