Preprint, Working Paper. Year: 2023

Performance Modeling and Estimation of a Configurable Output Stationary Neural Network Accelerator

Abstract

Neural network accelerators are designed to process neural networks (NNs) while optimizing three key performance indicators (KPIs): latency, power, and chip area. This work studies Gemini, an industrial prototype of a near-memory-computing inference accelerator designed using high-level synthesis. Gemini is a configurable output-stationary accelerator whose performance depends on two structural parameters. Measuring the KPIs requires simulations that are time-consuming and resource-intensive. This paper presents a practical high-level estimator that can instantly predict the KPIs from the NN and the Gemini configuration. The latency is accurately derived using an analytical model based on the architecture, the operator scheduling, and the NN characteristics. The power and the chip area are computed analytically, and the models are calibrated using simulations. Finally, we show how to use the estimator to derive Pareto optima for choosing the best Gemini configurations for a VGG-like NN.
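
The abstract mentions deriving Pareto optima over the three KPIs. As a minimal sketch of what that selection step could look like, assuming each candidate configuration has already been given a (latency, power, area) estimate, the Python below keeps only the non-dominated configurations. The `Estimate` record, the configuration names, and all numbers are hypothetical placeholders for illustration; they are not taken from the paper and do not reflect Gemini's actual parameters or results.

```python
from typing import NamedTuple


class Estimate(NamedTuple):
    # Hypothetical record: one accelerator configuration and its three KPIs.
    config: str
    latency: float  # arbitrary units, lower is better
    power: float    # arbitrary units, lower is better
    area: float     # arbitrary units, lower is better


def dominates(a: Estimate, b: Estimate) -> bool:
    """True if a is no worse than b on every KPI and strictly better on one."""
    ka, kb = a[1:], b[1:]
    no_worse = all(x <= y for x, y in zip(ka, kb))
    strictly_better = any(x < y for x, y in zip(ka, kb))
    return no_worse and strictly_better


def pareto_front(estimates: list[Estimate]) -> list[Estimate]:
    """Keep only the configurations that no other configuration dominates."""
    return [e for e in estimates if not any(dominates(o, e) for o in estimates)]


# Made-up placeholder numbers, not results from the paper.
candidates = [
    Estimate("cfg_a", latency=10.0, power=5.0, area=2.0),
    Estimate("cfg_b", latency=8.0, power=6.0, area=2.5),
    Estimate("cfg_c", latency=12.0, power=5.5, area=2.2),  # dominated by cfg_a
    Estimate("cfg_d", latency=8.0, power=6.0, area=3.0),   # dominated by cfg_b
]

front = pareto_front(candidates)
print([e.config for e in front])
```

The pairwise comparison is quadratic in the number of configurations, which is usually acceptable when the design space is spanned by a small number of structural parameters.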
Main file
archi_explorer_retarget (1).pdf (1.26 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04168803, version 1 (26-07-2023)

Identifiers

  • HAL Id: hal-04168803, version 1

Cite

Ali Oudrhiri, Emilien Taly, Nathan Bain, Alix Munier-Kordon, Roberto Guizzetti, et al. Performance Modeling and Estimation of a Configurable Output Stationary Neural Network Accelerator. 2023. ⟨hal-04168803⟩
