Journal article, ACM Transactions on Mathematical Software, Year: 2016

GPU-Accelerated Generation of Correctly Rounded Elementary Functions

Mourad Gouicem
  • Role: Author
  • PersonId : 976563

Abstract

The IEEE 754-2008 standard recommends the correct rounding of some elementary functions. This requires solving the Table Maker’s Dilemma (TMD), which implies a huge amount of CPU computation time. In this article, we consider accelerating such computations, namely the Lefèvre algorithm on graphics processing units (GPUs), which are massively parallel architectures with a partial single instruction, multiple data execution. We first propose an analysis of the Lefèvre hard-to-round argument search using the concept of continued fractions. We then propose a new parallel search algorithm that is much more efficient on GPUs thanks to its more regular control flow. We also present an efficient hybrid CPU-GPU deployment of the generation of the polynomial approximations required in the Lefèvre algorithm. In the end, we manage to obtain overall speedups up to 53.4× on one GPU over a sequential CPU execution and up to 7.1× over a hex-core CPU, which enable a much faster solution of the TMD for the double-precision format.
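
To make the Table Maker's Dilemma concrete, the following is a minimal brute-force sketch for a toy binary format, not the Lefèvre algorithm or the GPU search described in the article. It assumes round-to-nearest, the function exp, arguments in [1, 2), and a hypothetical precision parameter `p`; the helper names `midpoint_distance` and `hardest_to_round` are illustrative only. For each argument it measures how close exp(x), expressed in units in the last place, falls to a rounding midpoint.

```python
from decimal import Decimal, getcontext

# Working precision far above the toy format, so that the computed
# distances are reliable (an assumption that holds for small p).
getcontext().prec = 60


def midpoint_distance(m, p):
    """Distance, in ulps, between exp(x) and the nearest rounding midpoint.

    x = m / 2**(p-1) is a p-bit floating-point number in [1, 2).
    """
    x = Decimal(m) / Decimal(2 ** (p - 1))  # exact in decimal arithmetic
    y = x.exp()
    # Find e such that 2**e <= y < 2**(e+1).
    e = 0
    while Decimal(2) ** (e + 1) <= y:
        e += 1
    # Scale so that one unit equals one ulp of the p-bit result.
    scaled = y * Decimal(2) ** (p - 1 - e)
    frac = scaled - int(scaled)
    return abs(frac - Decimal("0.5"))


def hardest_to_round(p):
    """Exhaustively find the significand m whose exp(x) is closest to a midpoint."""
    return min(range(2 ** (p - 1), 2 ** p), key=lambda m: midpoint_distance(m, p))


if __name__ == "__main__":
    p = 8
    m = hardest_to_round(p)
    print(m, midpoint_distance(m, p))
```

The smaller the reported distance, the more accurate an approximation of exp(x) must be before the rounding decision is safe. Exhaustive search of this kind costs one evaluation per argument, which is why it does not scale to the double-precision format without a dedicated algorithm.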
Main file
article.pdf (275.79 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00751446 , version 1 (13-11-2012)
hal-00751446 , version 2 (05-06-2013)

Cite

Pierre Fortin, Mourad Gouicem, Stef Graillat. GPU-Accelerated Generation of Correctly Rounded Elementary Functions. ACM Transactions on Mathematical Software, 2016, 43 (3), pp. 22:1-22:26. ⟨10.1145/2935746⟩. ⟨hal-00751446v2⟩
333 views
327 downloads
