Conference papers

Towards Reproducible Blocked LU Factorization

Abstract : In this article, we address the problem of reproducibility of the blocked LU factorization on GPUs, which is compromised by cancellations and rounding errors in floating-point arithmetic. Thanks to the hierarchical structure of linear algebra libraries, the computations carried out within this operation can be expressed in terms of the Level-3 BLAS routines as well as the unblocked variant; the latter is, in turn, built upon the Level-1/2 BLAS kernels. In addition, we strengthen the numerical stability of the blocked LU factorization via partial row pivoting. Therefore, we propose a double-layer bottom-up approach for ensuring reproducibility of the blocked LU factorization and provide experimental results for its underlying blocks.
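
The two-layer structure described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' GPU implementation and does not make the arithmetic reproducible; it only shows how a blocked LU factorization with partial row pivoting decomposes into an unblocked panel factorization (Level-1/2 style operations) followed by a triangular solve and a matrix-matrix update (Level-3 style operations). The names `lu_unblocked`, `lu_blocked`, and the block size `nb` are hypothetical and chosen for illustration.

```python
def lu_unblocked(A, piv, k0, k1):
    """Factor the panel A[k0:, k0:k1] in place with partial row pivoting.

    Row swaps are applied to whole rows of A, so columns to the left and
    right of the panel are permuted consistently. (Illustrative helper.)
    """
    n = len(A)
    for j in range(k0, k1):
        # Pivot search: row with the largest |entry| in column j.
        p = max(range(j, n), key=lambda i: abs(A[i][j]))
        if A[p][j] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        if p != j:
            A[j], A[p] = A[p], A[j]
            piv[j], piv[p] = piv[p], piv[j]
        # Scale the column below the pivot to obtain the multipliers.
        for i in range(j + 1, n):
            A[i][j] /= A[j][j]
        # Rank-1 update, restricted to the remaining panel columns.
        for i in range(j + 1, n):
            lij = A[i][j]
            for c in range(j + 1, k1):
                A[i][c] -= lij * A[j][c]


def lu_blocked(A, nb=2):
    """Return (LU, piv) such that rows piv of A equal L @ U.

    L is unit lower triangular and U is upper triangular; both are packed
    into LU. nb is an assumed block size.
    """
    n = len(A)
    A = [row[:] for row in A]          # work on a copy
    piv = list(range(n))               # piv[i] = original index of row i
    for k in range(0, n, nb):
        e = min(k + nb, n)
        # Layer 1: unblocked factorization of the current panel.
        lu_unblocked(A, piv, k, e)
        # Layer 2a: triangular solve L11 * U12 = A12 (L11 unit lower).
        for i in range(k, e):
            for r in range(k, i):
                lir = A[i][r]
                for c in range(e, n):
                    A[i][c] -= lir * A[r][c]
        # Layer 2b: trailing update A22 -= L21 * U12.
        for i in range(e, n):
            for c in range(e, n):
                s = 0.0
                for r in range(k, e):
                    s += A[i][r] * A[r][c]
                A[i][c] -= s
    return A, piv
```

In this sketch every inner sum is accumulated in plain floating-point arithmetic, so its rounded value depends on the order of the operations. That dependence is the source of non-reproducibility the abstract refers to, and it arises in both layers, which is why a bottom-up approach has to start from the underlying kernels.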

Contributor : Roman Iakymchuk
Submitted on : Wednesday, March 22, 2017 - 11:42:55 AM
Last modification on : Thursday, March 21, 2019 - 1:06:45 PM
Long-term archiving on : Friday, June 23, 2017 - 12:42:55 PM


Files produced by the author(s)


  • HAL Id : hal-01456307, version 2


Roman Iakymchuk, Enrique Quintana-Ortí, Erwin Laure, Stef Graillat. Towards Reproducible Blocked LU Factorization. 4th International Workshop on Reproducibility in Parallel Computing in conjunction with IPDPS 2017 - 31st IEEE International Parallel & Distributed Processing Symposium, May 2017, Orlando, United States. ⟨hal-01456307v2⟩


