High-Efficiency Convolutional Ternary Neural Networks with Custom Adder Trees and Weight Compression

Adrien Prost-Boucle; Alban Bourge; Frédéric Pétrot

doi:10.1145/3294768

Journal article, ACM Transactions on Reconfigurable Technology and Systems (TRETS), Year: 2018

Adrien Prost-Boucle

Role: Author
PersonId : 12629
IdHAL : adrien-prost-boucle
IdRef : 192295306

Techniques de l'Informatique et de la Microélectronique pour l'Architecture des systèmes intégrés

Alban Bourge

Role: Author
PersonId : 4391
IdHAL : alban-bourge
IdRef : 199376557

Techniques de l'Informatique et de la Microélectronique pour l'Architecture des systèmes intégrés

Frédéric Pétrot

Role: Author
PersonId : 12920
IdHAL : frederic-petrot
ORCID : 0000-0003-0624-7373
IdRef : 108969223

Techniques de l'Informatique et de la Microélectronique pour l'Architecture des systèmes intégrés

Abstract

Although performing inference with artificial neural networks (ANN) was until quite recently considered essentially compute-intensive, the emergence of deep neural networks coupled with the evolution of integration technology has transformed inference into a memory-bound problem. With this established, many works have lately focused on minimizing memory accesses, either by enforcing and exploiting sparsity on weights or by using few bits to represent activations and weights, so that ANN inference can be used in embedded devices. In this work, we detail an architecture dedicated to inference using ternary {−1, 0, 1} weights and activations. This architecture is configurable at design time to provide throughput vs power trade-offs to choose from. It is also generic in the sense that it uses information drawn from the target technologies (memory geometries and cost, number of available cuts, etc.) to adapt as well as possible to the FPGA resources. This makes it possible to achieve up to 5.2k fps per Watt for classification on a VC709 board using approximately half of the resources of the FPGA.

Additional Key Words and Phrases: Ternary CNN, low power inference, hardware acceleration, FPGA
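
To illustrate why ternary {−1, 0, 1} weights and activations suit adder-based hardware, here is a minimal sketch of a dot product computed without any multiplication. It is not taken from the paper: the function name `ternary_dot` and the counting strategy are assumptions chosen for illustration, and the architecture described in the article may organize this computation differently.

```python
def ternary_dot(weights, activations):
    """Dot product of two ternary vectors using only comparisons and counting.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    if len(weights) != len(activations):
        raise ValueError("vectors must have the same length")
    plus = 0   # number of element pairs whose product is +1
    minus = 0  # number of element pairs whose product is -1
    for w, a in zip(weights, activations):
        if w not in (-1, 0, 1) or a not in (-1, 0, 1):
            raise ValueError("values must be ternary: -1, 0 or 1")
        if w == 0 or a == 0:
            continue  # a zero operand contributes nothing to the sum
        if w == a:
            plus += 1   # (+1)(+1) or (-1)(-1)
        else:
            minus += 1  # (+1)(-1) or (-1)(+1)
    return plus - minus


if __name__ == "__main__":
    w = [1, -1, 0, 1]
    x = [1, 1, -1, -1]
    print(ternary_dot(w, x))
```

Because every partial product is itself in {−1, 0, 1}, the whole sum reduces to two counts and one subtraction, which is the kind of operation an adder tree can accumulate.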

Domains

Hardware Architecture [cs.AR]; Neural and Evolutionary Computing [cs.NE]

Main file

trets_nocopyright.pdf (1.52 MB)

Origin: Files produced by the author(s)

Contributor: Alban Bourge

https://hal.science/hal-01686718

Submitted on: Monday, January 7, 2019, 09:18:45

Last modified on: Thursday, April 4, 2024, 20:55:16

Long-term archiving on: Monday, April 8, 2019, 13:51:00

Dates and versions

hal-01686718 , version 1 (24-01-2018)

hal-01686718 , version 2 (07-01-2019)

License

CC0 - Public Domain Dedication

Identifiers

HAL Id : hal-01686718 , version 2
DOI : 10.1145/3294768

Cite

Adrien Prost-Boucle, Alban Bourge, Frédéric Pétrot. High-Efficiency Convolutional Ternary Neural Networks with Custom Adder Trees and Weight Compression. ACM Transactions on Reconfigurable Technology and Systems (TRETS), 2018, Special Issue on Deep learning on FPGAs, 11 (3), pp.1-24. ⟨10.1145/3294768⟩. ⟨hal-01686718v2⟩

Collections

UGA CNRS TIMA
