More Instruction Level Parallelism Explains the Actual Efficiency of Compensated Algorithms

Abstract: The compensated Horner algorithm and the Horner algorithm with double-double arithmetic both improve the accuracy of polynomial evaluation in IEEE-754 floating point arithmetic. Each yields a polynomial evaluation as accurate as if it were computed with the classic Horner algorithm in twice the working precision. The two algorithms also share the same low-level computation of the floating point rounding errors and cost a similar number of floating point operations. We report numerical experiments showing that the compensated algorithm runs at least twice as fast as the double-double one on modern processors. We propose to explain this efficiency by identifying more instruction level parallelism in the compensated implementation. This property also applies to other compensated algorithms for summation, dot product and triangular linear system solving. More generally, this paper illustrates how this kind of performance analysis may be useful to highlight the actual efficiency of numerical algorithms.
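To make the abstract concrete, here is a minimal Python sketch of a compensated Horner scheme. It is an illustration under stated assumptions, not the authors' implementation: the function names (`two_sum`, `two_prod`, `comp_horner`, `exact_eval`) are chosen here, the product error is obtained with Dekker's splitting rather than a fused multiply-add, and coefficients are assumed to be listed from highest to lowest degree. Python will not reproduce the timings discussed in the paper; the sketch only shows the structure of the computation.

```python
from fractions import Fraction


def two_sum(a, b):
    """Return (s, e) with s = fl(a + b) and a + b = s + e exactly (Knuth)."""
    s = a + b
    z = s - a
    e = (a - (s - z)) + (b - z)
    return s, e


def split(a):
    """Dekker splitting of a double into high and low parts (a = hi + lo)."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Return (p, e) with p = fl(a * b) and a * b = p + e exactly (no overflow)."""
    p = a * b
    ah, al = split(a)
    bh, bl = split(b)
    e = al * bl - (((p - ah * bh) - al * bh) - ah * bl)
    return p, e


def horner(coeffs, x):
    """Classic Horner evaluation; coeffs run from highest to lowest degree."""
    r = coeffs[0]
    for a in coeffs[1:]:
        r = r * x + a
    return r


def comp_horner(coeffs, x):
    """Compensated Horner: classic recurrence plus a correction recurrence."""
    s = coeffs[0]
    c = 0.0
    for a in coeffs[1:]:
        p, pi = two_prod(s, x)       # rounding error of the product
        s, sigma = two_sum(p, a)     # rounding error of the sum
        c = c * x + (pi + sigma)     # correction term, never fed back into s
    return s + c


def exact_eval(coeffs, x):
    """Reference value in exact rational arithmetic (for checking only)."""
    r = Fraction(0)
    for a in coeffs:
        r = r * Fraction(x) + Fraction(a)
    return r


if __name__ == "__main__":
    # (x - 1)**5 expanded, evaluated close to its root: ill-conditioned.
    coeffs = [1.0, -5.0, 10.0, -10.0, 5.0, -1.0]
    x = 1.0001
    print("horner     :", horner(coeffs, x))
    print("compensated:", comp_horner(coeffs, x))
    print("exact      :", float(exact_eval(coeffs, x)))
```

In this sketch the correction `c` only consumes the error terms `pi` and `sigma` and never feeds back into `s`, so within one iteration the two recurrences do not wait on each other. That kind of independence is what the abstract refers to as instruction level parallelism.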
Document type:
Preprint, Working paper
11 pages. 2007

Cited literature: 14 references
Contributor: Philippe Langlois
Submitted on: Tuesday, 24 July 2007 - 14:57:29
Last modified on: Tuesday, 24 July 2007 - 15:44:37
Document(s) archived on: Thursday, 8 April 2010 - 23:55:36


Files produced by the author(s)


  • HAL Id : hal-00165020, version 1



Philippe Langlois, Nicolas Louvet. More Instruction Level Parallelism Explains the Actual Efficiency of Compensated Algorithms. 11 pages. 2007. 〈hal-00165020〉


