Journal article in Electronics, 2021

Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm

Abstract

This paper compares the latency, accuracy, training time and hardware costs of neural networks compressed with our new multi-objective evolutionary algorithm called NEMOKD, and with quantisation. We evaluate NEMOKD on Intel's Movidius Myriad X VPU processor, and quantisation on Xilinx's programmable Z7020 FPGA hardware. Evolving models with NEMOKD increases inference accuracy by up to 82% at the cost of 38% increased latency, with throughput performance of 100–590 image frames per second (FPS). Quantisation identifies a sweet spot of 3-bit precision in the trade-off between latency, hardware requirements, training time and accuracy. Parallelising FPGA implementations of 2- and 3-bit quantised neural networks increases throughput from 6 k FPS to 373 k FPS, a 62x speedup.
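The abstract refers to quantising networks to very low (2- and 3-bit) precision before deployment on the FPGA. As an illustration only, and not the exact quantisation scheme used in the paper, the following minimal NumPy sketch shows symmetric uniform quantisation of a weight tensor to n bits; the function name and the max-abs scaling choice are assumptions made for this example.

```python
import numpy as np

def quantise_symmetric(weights, n_bits=3):
    """Illustrative symmetric uniform quantisation of a weight tensor to n bits.

    A generic sketch, not the scheme used in the paper.
    """
    # Signed n-bit levels span [-(2^(n-1) - 1), 2^(n-1) - 1], keeping zero symmetric.
    q_max = 2 ** (n_bits - 1) - 1
    # Scale chosen so the largest-magnitude weight maps to q_max.
    scale = np.max(np.abs(weights)) / q_max
    # Round to the nearest integer level and clip to the representable range.
    q = np.clip(np.round(weights / scale), -q_max, q_max)
    # Return the integer codes (what low-precision hardware would store) and the
    # de-quantised approximation (what the network effectively computes with).
    return q.astype(np.int8), q * scale

# Example: quantise a small random weight matrix to 3 bits.
w = np.random.randn(4, 4).astype(np.float32)
codes, w_q = quantise_symmetric(w, n_bits=3)
print("max abs error:", np.max(np.abs(w - w_q)))
```

At 3 bits this gives only seven weight levels, which is why the paper's trade-off between accuracy, latency, training time and hardware cost becomes interesting at such precisions.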
Main file
Stewart-2021-Optimising Hardware Accelerated Neural Networks.pdf (1.32 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-03141719, version 1 (25-10-2021)

License

Attribution (CC BY)

Identifiers

Cite

Robert Stewart, Andrew Nowlan, Pascal Bacchus, Quentin Ducasse, Ekaterina Komendantskaya. Optimising Hardware Accelerated Neural Networks with Quantisation and a Knowledge Distillation Evolutionary Algorithm. Electronics, 2021, 10 (4), pp.1-21. ⟨10.3390/electronics10040396⟩. ⟨hal-03141719⟩