Computing floating-point logarithms with fixed-point operations

Abstract: Elementary functions from the mathematical library take floating-point inputs and return floating-point outputs. However, it is possible to implement them using only integer/fixed-point arithmetic. This option was not attractive between 1985 and 2005, because mainstream processor hardware supported 64-bit floating-point but only 32-bit integers. Besides, conversions between floating-point and integer were costly. This has changed in recent years, in particular now that native 64-bit integer support is widespread. The purpose of this article is therefore to reevaluate the relevance of computing floating-point functions in fixed-point. To this end, several variants of the double-precision logarithm function are implemented and evaluated. Formulating the problem as a fixed-point one is easy once the range has been (classically) reduced. Then, 64-bit integers provide slightly more accuracy than a 53-bit mantissa, which helps speed up the evaluation. Finally, multi-word arithmetic, critical for accurate implementations, is much faster in fixed-point and is natively supported by recent compilers. Novel techniques for argument reduction and for the rounding test are introduced in this context. Thanks to all this, a purely integer implementation of the correctly rounded double-precision logarithm outperforms the previous state of the art, with the worst-case execution time reduced by a factor of 5. This work also introduces variants of the logarithm that input a floating-point number and output the result in fixed-point. These are shown to be both more accurate and more efficient than the traditional floating-point functions for some applications.
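As an illustration of the general idea only, the following C sketch computes a base-2 logarithm of a double using nothing but integer operations and returns it in fixed-point. It is not the method of the paper: it uses the textbook digit-by-digit squaring recurrence, and it is neither correctly rounded nor tuned for speed. The function names `split_double` and `log2_fixed`, the Q2.62 and Q32.32 formats, and the restriction to positive normal inputs are all assumptions made for this example. The 128-bit product relies on `unsigned __int128`, a GCC/Clang extension rather than standard C.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: classical range reduction x = 2^E * m with m in [1,2).
   Returns m as an unsigned Q2.62 fixed-point number (m scaled by 2^62).
   Assumes x is positive and normal (no zero, subnormal, infinity or NaN). */
static uint64_t split_double(double x, int *E)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);              /* reinterpret without aliasing issues */
    uint64_t biased = (bits >> 52) & 0x7FF;      /* 11-bit biased exponent */
    assert((bits >> 63) == 0 && biased != 0 && biased != 0x7FF);
    *E = (int)biased - 1023;                     /* remove the binary64 bias */
    uint64_t frac = bits & ((UINT64_C(1) << 52) - 1);
    return ((UINT64_C(1) << 52) | frac) << 10;   /* 1.frac, rescaled from 2^52 to 2^62 */
}

/* Hypothetical example: log2(x) as a signed Q32.32 fixed-point number,
   computed by repeated squaring of the reduced mantissa (one result bit
   per iteration). Illustration only, not the algorithm of the paper. */
static int64_t log2_fixed(double x)
{
    int E;
    uint64_t m = split_double(x, &E);
    uint64_t f = 0;                              /* fractional bits of log2(m) */
    for (int i = 0; i < 32; i++) {
        /* Two-word product: 64 x 64 -> 128 bits (compiler extension). */
        unsigned __int128 sq = (unsigned __int128)m * m;
        m = (uint64_t)(sq >> 62);                /* back to Q2.62, now in [1,4) */
        f <<= 1;
        if (m >> 63) {                           /* m >= 2: next bit is 1, halve m */
            m >>= 1;
            f |= 1;
        }
    }
    return (int64_t)E * (INT64_C(1) << 32) + (int64_t)f;
}
```

Under these assumptions, `log2_fixed(8.0)` returns `3 * 2^32` and `log2_fixed(0.5)` returns `-2^32`; dividing the result by 2^32 recovers the usual real value. Returning a fixed-point result avoids the final integer-to-floating-point conversion, in the spirit of the fixed-point-output variants mentioned in the abstract.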
Document type:
Conference paper
23rd IEEE Symposium on Computer Arithmetic, Jul 2016, Santa Clara, United States. 2016

Contributor: Florent De Dinechin
Submitted on: Thursday, November 12, 2015 - 11:10:25
Last modified on: Wednesday, April 11, 2018 - 01:55:22
Document(s) archived on: Friday, April 28, 2017 - 08:30:28


Files produced by the author(s)


  • HAL Id: hal-01227877, version 1


Julien Le Maire, Nicolas Brunie, Florent De Dinechin, Jean-Michel Muller. Computing floating-point logarithms with fixed-point operations. 23rd IEEE Symposium on Computer Arithmetic, Jul 2016, Santa Clara, United States. 2016. 〈hal-01227877〉


