Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm

Robert Stewart; Andrew Nowlan; Pascal Bacchus; Quentin Ducasse; Ekaterina Komendantskaya

doi:10.3390/electronics10040396

Journal article, Electronics, Year: 2021

Robert Stewart

Role: Corresponding author

School of Mathematical and Computer Sciences

Andrew Nowlan

Role: Author

School of Mathematical and Computer Sciences

Pascal Bacchus

Role: Author

Quentin Ducasse

Role: Author
PersonId: 1289530
IdHAL: qducasse
ORCID: 0000-0001-9927-675X

École Nationale Supérieure de Techniques Avancées Bretagne

Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance

Ekaterina Komendantskaya

Role: Author

School of Mathematical and Computer Sciences

Abstract

This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel’s Movidius Myriad X VPU processor, and quantisation on Xilinx’s programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames-per-second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2-bit and 3-bit quantised neural networks increases throughput from 6 k FPS to 373 k FPS, a 62x speedup.
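
To make the notion of bit precision in the abstract concrete, the following is a minimal sketch of uniform quantisation in plain Python. It is an illustration only, not the quantisation scheme used in the paper: the function name quantise_uniform, the min/max range choice and the example weights are all assumptions made for this sketch. It also rechecks the throughput ratio quoted in the abstract.

```python
def quantise_uniform(values, bits):
    """Snap each float onto one of 2**bits evenly spaced levels in [min, max].

    Hypothetical helper for illustration; the paper's own scheme may differ.
    """
    if bits < 1:
        raise ValueError("bits must be >= 1")
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: nothing to quantise.
        return list(values)
    steps = 2 ** bits - 1          # intervals between the lowest and highest level
    step = (hi - lo) / steps       # width of one quantisation interval
    return [lo + round((v - lo) / step) * step for v in values]


# Made-up example weights (not taken from the paper).
weights = [-1.0, -0.4, 0.0, 0.3, 1.0]
q3 = quantise_uniform(weights, 3)  # at most 2**3 = 8 distinct values

# Throughput figures quoted in the abstract: 6 k FPS to 373 k FPS.
speedup = 373_000 / 6_000
print(q3)
print(f"speedup is roughly {speedup:.0f}x")
```

With 3 bits each weight is stored as one of at most eight levels, so the rounding error per weight is bounded by half an interval width. Fewer bits shrink the hardware cost but widen that interval, which is the trade-off the abstract refers to.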

Domains

Hardware Architecture [cs.AR]; Artificial Intelligence [cs.AI]

Main file

Stewart-2021-Optimising Hardware Accelerated Neural Networks.pdf (1.32 MB)

Origin: Publisher files allowed on an open archive

https://hal.science/hal-03141719

Submitted on: Monday, 25 October 2021, 13:15:26

Last modified on: Wednesday, 7 February 2024, 08:55:27

Long-term archiving on: Wednesday, 26 January 2022, 20:06:24

Dates and versions

hal-03141719 , version 1 (25-10-2021)

Licence

Attribution

Identifiers

HAL Id: hal-03141719, version 1
DOI: 10.3390/electronics10040396

Cite

Robert Stewart, Andrew Nowlan, Pascal Bacchus, Quentin Ducasse, Ekaterina Komendantskaya. Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm. Electronics, 2021, 10 (4), pp.1-21. ⟨10.3390/electronics10040396⟩. ⟨hal-03141719⟩

Collections

UNIV-BREST INSTITUT-TELECOM CNRS LAB-STICC_UBO ENIB LAB-STICC

49 Views

130 Downloads
