Parallel computation of echelon forms

We propose efficient parallel algorithms and implementations, on shared-memory architectures, of LU factorization over a finite field. Compared to the corresponding numerical routines, we have identified three main difficulties specific to linear algebra over finite fields. First, the arithmetic complexity could be dominated by modular reductions. It is therefore essential to delay these reductions as much as possible while mixing fine-grain parallelizations of tiled iterative and recursive algorithms. Second, fast linear algebra variants, e.g., using the Strassen-Winograd algorithm, never suffer from instability and can thus be widely used in cascade with the classical algorithms. Here, trade-offs must be made between block sizes well suited to those fast variants and block sizes well suited to load and communication balancing. Third, many applications over finite fields require the rank profile of the matrix (quite often rank deficient) rather than the solution to a linear system. It is thus important to design parallel algorithms that preserve and compute this rank profile. Moreover, as the rank profile is only discovered during the algorithm, the block size has to be dynamic. We propose and compare several block decompositions: tile iterative with left-looking, right-looking and Crout variants, slab and tile recursive. Experiments demonstrate that the tile recursive variant performs better and matches the performance of reference numerical software when no rank deficiency occurs. Furthermore, even in the most heterogeneous case, namely when all pivot blocks are rank deficient, we show that it is possible to maintain high efficiency.
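Two of the ideas in the abstract, delaying modular reductions and computing a rank profile, can be illustrated with a small sequential sketch. This is not the authors' implementation: the function names `delayed_dot` and `row_rank_profile`, the `word_bits` parameter, and the plain list-of-lists matrix layout are all assumptions made for illustration. The sketch assumes a prime modulus p small enough that (p - 1) + (p - 1)^2 fits in the simulated machine word, and inputs already reduced to the range 0..p-1 for the dot product.

```python
def delayed_dot(u, v, p, word_bits=63):
    """Dot product of u and v modulo p with delayed reductions.

    Products are accumulated without reduction; a reduction happens only
    when the next product could push the accumulator past the (simulated)
    machine-word limit. Returns (value mod p, number of reductions).
    Assumes entries of u and v lie in 0..p-1.
    """
    limit = (1 << word_bits) - 1        # simulated largest representable value
    max_term = (p - 1) * (p - 1)        # largest possible single product
    acc = 0
    reductions = 0
    for a, b in zip(u, v):
        if acc > limit - max_term:      # next product might overflow: reduce now
            acc %= p
            reductions += 1
        acc += a * b
    return acc % p, reductions + 1      # count the final reduction as well


def row_rank_profile(A, p):
    """Indices of the rows forming the row rank profile of A over Z/pZ.

    Rows are processed in order; a row joins the profile exactly when it
    is linearly independent of the rows before it. Assumes p is prime.
    """
    pivots = []                         # (pivot column, normalized reduced row)
    profile = []
    for i, row in enumerate(A):
        r = [x % p for x in row]
        for col, prow in pivots:        # eliminate against earlier pivot rows
            c = r[col]
            if c:
                r = [(x - c * y) % p for x, y in zip(r, prow)]
        col = next((j for j, x in enumerate(r) if x), None)
        if col is not None:             # a nonzero remainder means a new pivot
            inv = pow(r[col], -1, p)    # modular inverse (Python 3.8+)
            pivots.append((col, [(x * inv) % p for x in r]))
            profile.append(i)
    return profile
```

For example, `row_rank_profile([[1, 2, 3], [2, 4, 6], [0, 0, 0], [1, 0, 1]], 7)` skips the second row (a multiple of the first) and the zero row. The rank, here the length of the returned list, is only known once elimination has run, which is the reason the abstract gives for needing dynamic block sizes in the parallel setting.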

Keywords

Echelon form; Parallel algorithm

Domains

Symbolic Computation [cs.SC]; Distributed, Parallel, and Cluster Computing [cs.DC]

Main file

parallelPLUQ.pdf (251.3 KB)

Origin: Files produced by the author(s)

Contributor: Jean-Guillaume Dumas

https://hal.science/hal-00947013

Submitted on: Friday, February 14, 2014, 15:07:50

Last modified on: Thursday, April 4, 2024, 21:22:44

Long-term archiving on: Thursday, May 15, 2014, 10:41:49

Dates and versions

hal-00947013 , version 1 (14-02-2014)

Identifiers

HAL Id : hal-00947013 , version 1
ARXIV : 1402.3501
DOI : 10.1007/978-3-319-09873-9_42

Cite

Jean-Guillaume Dumas, Thierry Gautier, Clément Pernet, Ziad Sultan. Parallel computation of echelon forms. EuroPar-2014 - 20th International Conference on Parallel Processing, Aug 2014, Porto, Portugal. pp.499-510, ⟨10.1007/978-3-319-09873-9_42⟩. ⟨hal-00947013⟩


