H. Alemdar, V. Leroy, A. Prost-Boucle, and F. Pétrot, Ternary Neural Networks for Resource-Efficient AI Applications, 30th International Joint Conference on Neural Networks, pp. 2547-2554, 2017. Training code available online.
URL: https://hal.archives-ouvertes.fr/hal-01570788

R. Andri, L. Cavigelli, D. Rossi, and L. Benini, YodaNN: An Architecture for Ultra-Low Power Binary-Weight CNN Acceleration, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2017.

K. Batcher, The architecture of tomorrow's massively parallel computer, 1987.

M. Courbariaux, Y. Bengio, and J. David, BinaryConnect: Training deep neural networks with binary weights during propagations, Advances in Neural Information Processing Systems, pp. 3123-3131, 2015.

G. Desoli, N. Chawla, T. Boesch, S. Singh, E. Guidetti et al., A 2.9 TOPS/W deep convolutional neural network SoC in FD-SOI 28nm for intelligent embedded systems, IEEE International Solid-State Circuits Conference (ISSCC), pp. 238-239, 2017.

N. J. Fraser, Y. Umuroglu, G. Gambardella, M. Blott, and P. Leong, Scaling Binarized Neural Networks on Reconfigurable Logic, Proceedings of the 8th Workshop and 6th Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms, pp. 25-30, 2017.

S. Han, X. Liu, H. Mao, J. Pu, A. Pedram et al., EIE: Efficient inference engine on compressed deep neural network, Proceedings of the 43rd International Symposium on Computer Architecture, pp. 243-254, 2016.

L. Hou, Q. Yao, and J. T. Kwok, Loss-aware Binarization of Deep Networks, 5th International Conference on Learning Representations, 2017.

I. Hubara, M. Courbariaux, D. Soudry, R. El-yaniv, and Y. Bengio, Quantized neural networks: Training neural networks with low precision weights and activations, 2016.

K. Hwang and W. Sung, Fixed-point feedforward deep neural network design using weights +1, 0, and -1, IEEE Workshop on Signal Processing Systems (SiPS), pp. 1-6, 2014.

M. Jacobsen, D. Richmond, M. Hogains, and R. Kastner, RIFFA 2.1: A reusable integration framework for FPGA accelerators, ACM Transactions on Reconfigurable Technology and Systems, vol. 8, no. 4, p. 23, 2015.

D. Kim, J. Ahn, and S. Yoo, A novel zero weight/activation-aware hardware architecture of convolutional neural network, Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1462-1467, 2017.

D. E. Knuth, The Art of Computer Programming, vol. 2, 1997.

A. Krizhevsky, Learning Multiple Layers of Features from Tiny Images, 2009.

M. Kumm and P. Zipf, Pipelined compressor tree optimization using integer linear programming, 24th International Conference on Field Programmable Logic and Applications, pp. 1-8, 2014.

Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, Gradient-based learning applied to document recognition, Proc. IEEE, vol. 86, pp. 2278-2324, 1998.

F. Li, B. Zhang, and B. Liu, Ternary weight networks, 2016.

Y. Li, Z. Liu, K. Xu, H. Yu, and F. Ren, A 7.663-TOPS 8.2-W Energy-efficient FPGA Accelerator for Binary Convolutional Neural Networks, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 290-291, 2017.

Z. Liu, Y. Dou, J. Jiang, J. Xu, S. Li et al., Throughput-Optimized FPGA Accelerator for Deep Convolutional Neural Networks, ACM Transactions on Reconfigurable Technology and Systems, vol. 10, no. 3, p. 23, 2017.

D. J. M. Moss, E. Nurvitadhi, J. Sim, A. Mishra, D. Marr et al., High performance binary neural networks on the Xeon+FPGA platform, 27th International Conference on Field Programmable Logic and Applications, pp. 1-4, 2017.

H. Nakahara, T. Fujii, and S. Sato, A fully connected layer elimination for a binarized convolutional neural network on an FPGA, 27th International Conference on Field Programmable Logic and Applications, pp. 1-4, 2017.

Y. Netzer, T. Wang, A. Coates, A. Bissacco, B. Wu et al., Reading Digits in Natural Images with Unsupervised Feature Learning, NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.

E. Nurvitadhi, G. Venkatesh, J. Sim, D. Marr, R. Huang et al., Can FPGAs Beat GPUs in Accelerating Next-Generation Deep Neural Networks?, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (FPGA '17), pp. 5-14, 2017.

J. Park and W. Sung, FPGA based implementation of deep neural networks using on-chip memory only, IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 1011-1015, 2016.

A. Pedram, S. Richardson, M. Horowitz, S. Galal, and S. Kvatinsky, Dark memory and accelerator-rich system optimization in the dark silicon era, IEEE Design & Test, vol. 34, pp. 39-50, 2017.

A. Prost-Boucle, A. Bourge, F. Pétrot, H. Alemdar, N. Caldwell et al., Scalable High-Performance Architecture for Convolutional Ternary Neural Networks on FPGA, 27th International Conference on Field Programmable Logic and Applications, pp. 1-7, 2017.
URL: https://hal.archives-ouvertes.fr/hal-01563763

M. Rastegari, V. Ordonez, J. Redmon, and A. Farhadi, XNOR-Net: ImageNet classification using binary convolutional neural networks, European Conference on Computer Vision, pp. 525-542, 2016.

K. Simonyan and A. Zisserman, Very deep convolutional networks for large-scale image recognition, 2014.

P. Škoda, T. Lipić, and Á. Srp, Implementation framework for Artificial Neural Networks on FPGA, Proceedings of the 34th International Convention MIPRO, pp. 274-278, 2011.

J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel, Man vs. computer: Benchmarking machine learning algorithms for traffic sign recognition, International Joint Conference on Neural Networks, 2011.

C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016.

O. Temam, The rebirth of neural networks, keynote speech at the International Symposium on Computer Architecture, 2010.
URL: https://hal.archives-ouvertes.fr/inria-00535554

Y. Umuroglu, N. J. Fraser, G. Gambardella, M. Blott, and P. Leong, FINN: A framework for fast, scalable binarized neural network inference, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 65-74, 2017.

Y. Umuroglu, N. J. Fraser, G. Gambardella, M. Blott, P. Leong et al., FINN: A Framework for Fast, Scalable Binarized Neural Network Inference, arXiv preprint, 2016.

K. Vissers, A Framework for Reduced Precision Neural Networks on FPGA, 17th International Forum on MPSoC, 2017. Slides available online.

E. Wu, X. Zhang, D. Berman, and I. Cho, A high-throughput reconfigurable processing array for neural networks, 27th International Conference on Field Programmable Logic and Applications, pp. 1-4, 2017.

R. Zhao, W. Song, W. Zhang, T. Xing, J. Lin et al., Accelerating Binarized Convolutional Neural Networks with Software-Programmable FPGAs, Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, pp. 15-24, 2017.

C. Zhu, S. Han, H. Mao, and W. J. Dally, Trained ternary quantization, 2017.