Performance Modeling and Estimation of a Configurable Output Stationary Neural Network Accelerator

Ali Oudrhiri; Emilien Taly; Nathan Bain; Alix Munier-Kordon; Roberto Guizzetti; Pascal Urard

Preprint, Working Paper. Year: 2023

Ali Oudrhiri

Role: Author

STMicroelectronics [Crolles]

Architecture et Logiciels pour Systèmes Embarqués sur Puce

Emilien Taly

Role: Author

STMicroelectronics [Crolles]

Nathan Bain

Role: Author

STMicroelectronics [Crolles]

Alix Munier-Kordon

Role: Author
PersonId : 9539
IdHAL : alix-munier-kordon
ORCID : 0000-0002-2170-6366
IdRef : 159831458

Architecture et Logiciels pour Systèmes Embarqués sur Puce

Roberto Guizzetti

Role: Author

STMicroelectronics [Crolles]

Pascal Urard

Role: Author

STMicroelectronics [Crolles]

Abstract

Neural network accelerators are designed to process Neural Networks (NN) while optimizing three Key Performance Indicators (KPIs): latency, power, and chip area. This work studies Gemini, an industrial prototype of a near-memory computing inference accelerator designed using high-level synthesis. Gemini is a configurable output stationary accelerator whose performance depends on two structural parameters. Measuring the KPIs requires simulations that are time-consuming and resource-intensive. This paper presents a practical high-level estimator that instantly predicts the KPIs from the NN and the Gemini configuration. The latency is accurately derived using an analytical model based on the architecture, the operator scheduling, and the NN characteristics. The power and the chip area are computed analytically, and the models are calibrated using simulations. Finally, we show how to use the estimator to derive Pareto optima for choosing the best Gemini configurations for a VGG-like NN.

Keywords

Neural network accelerator; output stationary; estimation; latency; power; area

Domains

Neural networks [cs.NE]

Main file

archi_explorer_retarget (1).pdf (1.26 MB)

Origin: Files produced by the author(s)

https://hal.science/hal-04168803

Submitted on: Wednesday, July 26, 2023, 14:14:07

Last modified on: Saturday, October 7, 2023, 21:36:22

Long-term archiving on: Friday, October 27, 2023, 18:34:36

Dates and versions

hal-04168803, version 1 (26-07-2023)

Identifiers

HAL Id: hal-04168803, version 1

Cite

Ali Oudrhiri, Emilien Taly, Nathan Bain, Alix Munier-Kordon, Roberto Guizzetti, et al. Performance Modeling and Estimation of a Configurable Output Stationary Neural Network Accelerator. 2023. ⟨hal-04168803⟩

Collections

CNRS LIP6 SORBONNE-UNIVERSITE SU-SCIENCES
