Towards high performance stochastic arithmetic

Pacôme Eberhart; Julien Brajard; Pierre Fortin; Fabienne Jézéquel

Conference paper, Year : 2014


Pacôme Eberhart

Role : Author
PersonId : 971508

Performance et Qualité des Algorithmes Numériques

Julien Brajard

Role : Author
PersonId : 13461
IdHAL : julien-brajard
ORCID : 0000-0003-0634-1482
IdRef : 115106359

Processus de la variabilité climatique tropicale et impacts

Pierre Fortin

Role : Author
PersonId : 2113
IdHAL : pierre-fortin
ORCID : 0000-0003-3117-9122
IdRef : 11411255X

Performance et Qualité des Algorithmes Numériques

Fabienne Jézéquel

Role : Author
PersonId : 9537
IdHAL : fabienne-jezequel
ORCID : 0000-0002-8782-7566
IdRef : 140519335

Performance et Qualité des Algorithmes Numériques

Abstract

Because floating-point numbers have a finite representation in computers, the results of arithmetic operations must be rounded. The CADNA library [1], based on discrete stochastic arithmetic [2], can be used to estimate the propagation of rounding errors in scientific codes. By synchronously computing each operation three times with a randomly chosen rounding mode, CADNA estimates the number of exact significant digits of the result within a 95% confidence interval. To ensure the validity of the method and allow a better analysis of the program, several types of anomalies are checked at execution time. However, execution time can increase by a factor of up to 80, depending on the program and on the level of anomaly detection [3]. Two main factors explain this overhead: the cost of anomaly detection and the cost of the stochastic operations.

Firstly, the detection of cancellations (sudden losses of accuracy in a single operation) is based on computing the number of exact significant digits, which relies on evaluating a logarithm. This mathematical function is much more costly than floating-point arithmetic operations. Secondly, the stochastic operators are currently implemented by overloading the arithmetic operators and changing the rounding mode of the FPU (Floating Point Unit). This method makes vectorization impossible, as each vector lane would need a different rounding mode. Moreover, it incurs overhead from the function calls introduced by operator overloading and from the flushing of the FPU pipelines caused by rounding mode changes. The performance drop is therefore even greater for HPC applications, which rely on SIMD (Single Instruction Multiple Data) processing and on pipeline filling for efficiency.

To bypass these overheads and allow the use of vector instructions for SIMD parallelism, we propose several improvements to the CADNA library. Since only the integer part of the number of exact significant digits is required, we can use the exponent of a floating-point value as an approximation of the logarithm, which removes the logarithm function call. To avoid the cost of function calls, we propose to inline the stochastic operators. Finally, rather than depending on the rounding modes of the FPU, we compute the randomly rounded arithmetic operations by manipulating the sign bit of the operands through masks. These contributions provide a speedup factor of up to 2.5 on scalar code. They also enable the use of CADNA with vectorized code: SIMD performance results on high-end CPUs and on an Intel Xeon Phi are presented.

Domains

Computer Science [cs]


https://hal.science/hal-01217244

Submitted on : Monday, 19 October 2015, 11:26:16

Last modified on : Friday, 19 April 2024, 16:18:58

Dates and versions

hal-01217244, version 1 (19-10-2015)

Identifiers

HAL Id : hal-01217244, version 1

Cite

Pacôme Eberhart, Julien Brajard, Pierre Fortin, Fabienne Jézéquel. Towards high performance stochastic arithmetic. 16th international symposium on Scientific Computing, Computer Arithmetic and Validated Numerics (SCAN 2014), Sep 2014, Würzburg, Germany. pp.47-48. ⟨hal-01217244⟩


Collections

CEA INSU MNHN UNIV-PARIS7 X ENS-PARIS UPMC CNRS LOCEAN LIP6 UVSQ PSL SORBONNE-UNIVERSITE SU-SCIENCES FR-636 UP-SCIENCES IPSL_LOCEAN

