Best known method: Avoid heterogeneous precision in control flow calculations, Intel, Tech. Rep., p.480, 2013.
N-body: FP atomics v. recomputation, July 2014.
SOFA: an open source framework for medical simulation, Medicine Meets Virtual Reality (MMVR'15), 2007.
URL : https://hal.archives-ouvertes.fr/inria-00319416
Determinism and reproducibility in large-scale HPC systems, Informal Proceedings of the 4th Workshop on Determinism and Correctness in Parallel Programming, 2013.
Accelerating SQL database operations on a GPU with CUDA, Proceedings of 3rd Workshop on General Purpose Processing on Graphics Processing Units, GPGPU 2010, vol.425, pp.94-103, 2010.
Full-Speed Deterministic Bit-Accurate Parallel Floating-Point Summation on Multi- and Many-Core Architectures, 2014.
Parallel reproducible summation, IEEE Trans. Computers, vol.64, issue.7, pp.2060-2070, 2015.
DOI : 10.1109/tc.2014.2345391
Accuracy and stability of numerical algorithms, 2002.
Handbook of floating-point arithmetic, 2010.
URL : https://hal.archives-ouvertes.fr/ensl-00379167
The GPU computing era, IEEE Micro, vol.30, pp.56-69, 2010.
DOI : 10.1109/mm.2010.41
Impacting predictability of gpu's, 2013.
LU, QR and Cholesky factorizations using vector capabilities of GPUs, 2008.
To GPU synchronize or not GPU synchronize, Circuits and Systems (ISCAS), Proceedings of 2010 IEEE International Symposium on, pp.3801-3804, 2010.
Inter-block GPU communication via fast barrier synchronization, IPDPS, pp.1-12, 2010.
Efficient synchronization primitives for GPUs, CoRR, 2011.
CUDA by example: an introduction to general-purpose GPU programming, Addison-Wesley, 2010.
de Gruyter Studies in Mathematics, vol.33, 2013.
GNU MP: The GNU Multiple Precision Arithmetic Library.
Comments on fast and exact accumulation of products, Applied Parallel and Scientific Computing, pp.148-156, 2012.