A Holistic Approach for Optimizing DSP Block Utilization of a CNN implementation on FPGA

Abstract: Deep Neural Networks are becoming the de facto standard models for image understanding and, more generally, for computer vision tasks. Because they involve highly parallelizable computations, Convolutional Neural Networks (CNNs) are well suited to current fine-grained programmable logic devices, and multiple CNN accelerators have been successfully implemented on Field-Programmable Gate Arrays (FPGAs). Unfortunately, FPGA resources such as logic elements or Digital Signal Processing (DSP) units remain limited. This work presents a holistic method relying on approximate computing and design space exploration to optimize the DSP block utilization of a CNN implementation on FPGA. The method was tested by implementing a reconfigurable Optical Character Recognition (OCR) convolutional neural network on an Altera Stratix V device, varying both the data representation and the CNN topology to find the best combination in terms of DSP block utilization and classification accuracy. This exploration generated dataflow architectures for 76 CNN topologies with 5 different fixed-point representations. The most efficient implementation performs 883 classifications/sec at 256 × 256 resolution using 8% of the available DSP blocks.
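To illustrate the trade-off the abstract describes between data representation and accuracy, the following minimal sketch quantizes values to signed fixed-point formats of several widths and reports the worst-case rounding error. It is not the authors' implementation: the function names, the sample weights, the chosen bit widths, and the round-and-saturate scheme are all assumptions made for illustration, since the abstract does not specify the five representations that were explored.

```python
def to_fixed_point(value, total_bits, frac_bits):
    """Round `value` to the nearest signed fixed-point number with
    `total_bits` bits, `frac_bits` of them fractional, saturating on overflow.
    (Hypothetical helper; the paper's exact scheme is not given here.)"""
    scale = 1 << frac_bits
    lowest = -(1 << (total_bits - 1))       # most negative integer code
    highest = (1 << (total_bits - 1)) - 1   # most positive integer code
    code = round(value * scale)             # nearest integer code
    code = max(lowest, min(highest, code))  # saturate instead of wrapping
    return code / scale


def max_quantization_error(values, total_bits, frac_bits):
    """Largest absolute difference between each value and its quantized form."""
    return max(abs(v - to_fixed_point(v, total_bits, frac_bits)) for v in values)


# Hypothetical weights in [-1, 1); real CNN weights would come from training.
weights = [0.5, -0.25, 0.1, 0.8125, -0.9]

# Assumed bit widths, one sign bit and the rest fractional.
for total_bits in (4, 6, 8, 12, 16):
    error = max_quantization_error(weights, total_bits, total_bits - 1)
    print(f"{total_bits:2d} bits: max error = {error:.6f}")
```

Narrower formats need fewer multiplier resources but introduce larger rounding errors, which is why an exploration of this kind has to weigh resource use against classification accuracy.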
Document type:
Conference paper
Proceedings of the 10th International Conference on Distributed Smart Cameras - ICDSC'16, Sep 2016, Paris, France.

https://hal.archives-ouvertes.fr/hal-01415955
Contributor: Maxime Pelcat
Submitted on: Wednesday, 20 December 2017 - 11:17:11
Last modified on: Friday, 16 November 2018 - 01:28:46

File

ICDSC_Main_Open.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01415955, version 1

Citation

Kamel Eddine Abdelouahab, Cédric Bourrasset, Maxime Pelcat, François Berry, Jean-Charles Quinton, et al. A Holistic Approach for Optimizing DSP Block Utilization of a CNN implementation on FPGA. Proceedings of the 10th International Conference on Distributed Smart Cameras - ICDSC'16, Sep 2016, Paris, France. 〈hal-01415955〉

Metrics

Record views: 382
File downloads: 63