HAL open archive
Preprint, Working Paper Year: 2013

GPU-accelerated generation of correctly-rounded elementary functions

Abstract

The IEEE 754-2008 standard recommends the correct rounding of some elementary functions. This requires solving the Table Maker's Dilemma, which implies a huge amount of CPU computation time. In this paper we consider accelerating such computations, namely Lefèvre's algorithm, on Graphics Processing Units (GPUs), which are massively parallel architectures with partial SIMD (Single Instruction, Multiple Data) execution. We first propose an analysis of the Lefèvre hard-to-round argument search using the concept of continued fractions. We then propose a new parallel search algorithm that is much more efficient on GPUs thanks to its more regular control flow. We also present an efficient hybrid CPU-GPU deployment of the generation of the polynomial approximations required by Lefèvre's algorithm. In the end, we obtain overall speedups of up to 53.4x on one GPU over a sequential CPU execution, and up to 7.1x over a multi-core CPU, which enables much faster solving of the Table Maker's Dilemma for the double precision format.
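
To make the Table Maker's Dilemma concrete, the following is a minimal sketch, not Lefèvre's algorithm and not the GPU method of the paper. It exhaustively scans a toy binary format with a p-bit significand and reports the argument whose exponential lies closest to a rounding midpoint. The function name `hardest_to_round`, the toy precision `p`, the choice of exp and the interval [1/2, 1) are all assumptions made for illustration.

```python
from decimal import Decimal, getcontext

# Working precision in decimal digits, far above the toy target format.
getcontext().prec = 60


def hardest_to_round(p=10):
    """Brute-force hard-to-round search for exp on a toy p-bit format.

    Scans every x = k / 2**p in [1/2, 1) and returns (dist, x), where dist is
    the smallest distance, measured in ulps of the result, between exp(x) and
    a midpoint of two consecutive p-bit floating-point numbers.
    """
    half = Decimal("0.5")
    best = None
    for k in range(2 ** (p - 1), 2 ** p):
        x = Decimal(k) / Decimal(2 ** p)      # exact: denominator is a power of 2
        y = x.exp()                           # lies in [e**0.5, e), about [1.65, 2.72)
        # One ulp of y is 2**(1-p) when y is in [1, 2) and 2**(2-p) when y is in [2, 4).
        scale = 2 ** (p - 1) if y < 2 else 2 ** (p - 2)
        scaled = y * scale                    # y expressed in ulps
        frac = scaled - int(scaled)           # fractional part, in [0, 1)
        dist = abs(frac - half)               # 0 would mean an exact midpoint
        if best is None or dist < best[0]:
            best = (dist, x)
    return best


if __name__ == "__main__":
    dist, x = hardest_to_round(10)
    print("hardest argument:", x, "distance to midpoint (ulps):", dist)
```

The smaller the reported distance, the more accurately exp(x) must be approximated before rounding to nearest can be decided safely. The cost of this scan doubles with every extra bit of precision, so it is hopeless for the double precision format; this is the kind of search cost that the paper's algorithms are designed to reduce.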
Main file
article.pdf (284.69 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00751446, version 1 (13-11-2012)
hal-00751446, version 2 (05-06-2013)

Cite

Pierre Fortin, Mourad Gouicem, Stef Graillat. GPU-accelerated generation of correctly-rounded elementary functions. 2013. ⟨hal-00751446v1⟩
334 views
331 downloads
