Towards Reproducible Blocked LU Factorization

Abstract : In this article, we address the problem of reproducibility of the blocked LU factorization on GPUs, which is compromised by cancellations and rounding errors in floating-point arithmetic. Thanks to the hierarchical structure of linear algebra libraries, the computations carried out within this operation can be expressed in terms of the Level-3 BLAS routines as well as the unblocked variant; the latter is, correspondingly, built upon the Level-1/2 BLAS kernels. In addition, we strengthen the numerical stability of the blocked LU factorization via partial row pivoting. Therefore, we propose a double-layer bottom-up approach for ensuring reproducibility of the blocked LU factorization and provide experimental results for its underlying blocks.
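The layering described in the abstract can be sketched as follows. This is a minimal pure-Python illustration of a right-looking blocked LU factorization with partial row pivoting, not the authors' GPU implementation, and it uses ordinary floating-point arithmetic with no reproducibility guarantees. The function names (`unblocked_lu`, `blocked_lu`, `apply_pivots`, `reconstruct`) and the `block` parameter are hypothetical and chosen for illustration only. The comments mark where an unblocked panel factorization (Level-1/2 BLAS style work) and the TRSM and GEMM updates (Level-3 BLAS style work) would sit.

```python
def unblocked_lu(A, k0, k1, piv):
    """Factor the panel A[k0:, k0:k1] in place (Level-1/2 BLAS style work).

    Partial row pivoting: whole rows of A are swapped, so the swap also
    reaches the columns to the left and right of the panel.
    """
    n = len(A)
    for j in range(k0, k1):
        # Pivot search: row with the largest magnitude in column j.
        p = max(range(j, n), key=lambda i: abs(A[i][j]))
        if A[p][j] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        if p != j:
            A[j], A[p] = A[p], A[j]
        piv[j] = p
        for i in range(j + 1, n):
            A[i][j] /= A[j][j]                    # multiplier, stored in L
            for c in range(j + 1, k1):            # update inside the panel only
                A[i][c] -= A[i][j] * A[j][c]


def blocked_lu(A, block=2):
    """Return (LU, piv) with L (unit lower) and U packed into one matrix."""
    n = len(A)
    A = [row[:] for row in A]                     # work on a copy
    piv = list(range(n))
    for k0 in range(0, n, block):
        k1 = min(k0 + block, n)
        unblocked_lu(A, k0, k1, piv)              # panel factorization
        # TRSM-like step: U12 = inv(L11) * A12, with L11 unit lower triangular.
        for i in range(k0, k1):
            for r in range(k0, i):
                for c in range(k1, n):
                    A[i][c] -= A[i][r] * A[r][c]
        # GEMM-like step: A22 = A22 - L21 * U12 (trailing matrix update).
        for i in range(k1, n):
            for c in range(k1, n):
                s = 0.0
                for r in range(k0, k1):
                    s += A[i][r] * A[r][c]
                A[i][c] -= s
    return A, piv


def apply_pivots(A, piv):
    """Apply the recorded row swaps, in order, to a copy of A (gives P*A)."""
    B = [row[:] for row in A]
    for j, p in enumerate(piv):
        if p != j:
            B[j], B[p] = B[p], B[j]
    return B


def reconstruct(LU):
    """Multiply the packed factors back together (gives L*U)."""
    n = len(LU)
    return [[sum((1.0 if r == i else LU[i][r]) * LU[r][c]
                 for r in range(min(i, c) + 1))
             for c in range(n)] for i in range(n)]


if __name__ == "__main__":
    A = [[2.0, 1.0, 1.0, 0.0],
         [4.0, 3.0, 3.0, 1.0],
         [8.0, 7.0, 9.0, 5.0],
         [6.0, 7.0, 9.0, 8.0]]
    LU, piv = blocked_lu(A, block=2)
    PA, R = apply_pivots(A, piv), reconstruct(LU)
    print(max(abs(R[i][c] - PA[i][c]) for i in range(4) for c in range(4)))
```

Note that the GEMM-like step accumulates products in a different order than the unblocked loop would, so different block sizes (or different parallel reduction orders on a GPU) can in general produce results that differ in the last bits. That sensitivity to operation order is the kind of non-reproducibility the abstract refers to.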

Cited literature : 21 references

https://hal.archives-ouvertes.fr/hal-01456307
Contributor : Roman Iakymchuk
Submitted on : Wednesday, March 22, 2017 - 11:42:55 AM
Last modification on : Thursday, March 21, 2019 - 1:06:45 PM
Long-term archiving on : Friday, June 23, 2017 - 12:42:55 PM

File

REPPAR-05.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01456307, version 2

Citation

Roman Iakymchuk, Enrique Quintana-Ortí, Erwin Laure, Stef Graillat. Towards Reproducible Blocked LU Factorization. 4th International Workshop on Reproducibility in Parallel Computing in conjunction with IPDPS 2017 - 31st IEEE International Parallel & Distributed Processing Symposium, May 2017, Orlando, United States. ⟨hal-01456307v2⟩
