Report, 2010

Performance and accuracy of the matrix multiplication routines: CUBLAS on Nvidia Tesla versus MKL and ATLAS on Intel Nehalem

Abstract

Scientific computation relies heavily on 64-bit arithmetic. The evolution of Graphics Processing Units into massively parallel vector units, together with their improved programmability, makes them powerful algebraic coprocessors for many classes of matrix computation. But on these processors, which inherit from architectures originally dedicated to video processing, support for double precision remains limited. One building block of dense linear algebra, the GEneral Matrix Multiply routine (GEMM), has been considerably accelerated on the GPU. In this paper we present detailed measurements of its speed and, first and foremost, its accuracy.
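As a minimal sketch of the kind of accuracy question the abstract raises (this setup is illustrative, not taken from the paper), one can compare a matrix product computed in single precision against a double-precision reference and measure the componentwise relative error:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512
a = rng.random((n, n))
b = rng.random((n, n))

# Reference product computed in 64-bit (double) precision.
c64 = a @ b

# Same product with inputs rounded to 32-bit (single) precision,
# then promoted back to double for comparison.
c32 = (a.astype(np.float32) @ b.astype(np.float32)).astype(np.float64)

# Maximum componentwise relative error of the single-precision result.
# Entries of c64 are strictly positive here, so the division is safe.
rel_err = np.max(np.abs(c32 - c64) / np.abs(c64))
print(f"max relative error (float32 vs float64): {rel_err:.2e}")
```

With positive random inputs there is no cancellation, so the single-precision error stays within a few orders of magnitude of the unit roundoff; the error profile of an actual GPU GEMM additionally depends on the accumulation order and blocking of the implementation, which is what the report examines.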
Main file: MatmulNumaccCUDA.pdf (196.76 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00699377, version 1 (21-05-2012)

Identifiers

  • HAL Id: hal-00699377, version 1

Cite

Philippe Estival, Luc Giraud. Performance and accuracy of the matrix multiplication routines: CUBLAS on Nvidia Tesla versus MKL and ATLAS on Intel Nehalem. 2010. ⟨hal-00699377⟩

Collections

LARA
175 views
3047 downloads
