D. H. Bailey, K. Lee, and H. Simon, Using Strassen's algorithm to accelerate the solution of linear systems, The Journal of Supercomputing, vol.4, issue.4, pp.357-371, 1991.
DOI : 10.1007/BF00129836

A. R. Benson and G. Ballard, A framework for practical parallel fast matrix multiplication, Proceedings of the 20th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP 2015, pp.42-53, 2015.

L. Bettale, J. Faugère, and L. Perret, Cryptanalysis of HFE, multi-HFE and variants for odd and even characteristic, Designs, Codes and Cryptography, pp.1-52
URL : https://hal.archives-ouvertes.fr/hal-00776072

F. Broquedis, T. Gautier, and V. Danjean, libKOMP, an Efficient OpenMP Runtime System for Both Fork-Join and Data Flow Paradigms, IWOMP, pp.102-115, 2012.
DOI : 10.1007/978-3-642-30961-8_8
URL : https://hal.archives-ouvertes.fr/hal-00796253

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002

P. D'Alberto, M. Bodrato, and A. Nicolau, Exploiting parallelism in matrix-computation kernels for symmetric multiprocessor systems: Matrix-multiplication and matrix-addition algorithm optimizations by software pipelining and threads allocation, ACM Transactions on Mathematical Software, vol.38, issue.1, article 2, 2011.

J. Dongarra, J. Du Croz, I. Duff, and S. Hammarling, A proposal for a set of level 3 basic linear algebra subprograms, ACM SIGNUM Newsletter, vol.22, issue.3, pp.2-14, 1987.
DOI : 10.1145/36318.36319

J. J. Dongarra, M. Faverge, H. Ltaief, and P. Luszczek, Achieving numerical accuracy and high performance using recursive tile LU factorization, Concurrency and Computation: Practice and Experience, pp.1408-1431, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00809765

J. Dumas, T. Gautier, and C. Pernet, Finite field linear algebra subroutines, Proceedings of the 2002 International Symposium on Symbolic and Algebraic Computation, ISSAC '02, pp.63-74, 2002.
DOI : 10.1145/780506.780515

J. Dumas, T. Gautier, C. Pernet, and Z. Sultan, Parallel Computation of Echelon Forms, Euro-Par 2014 Parallel Processing, pp.499-510, 2014.
DOI : 10.1007/978-3-319-09873-9_42
URL : https://hal.archives-ouvertes.fr/hal-00947013

J. Dumas, P. Giorgi, and C. Pernet, Dense Linear Algebra over Word-Size Prime Fields, ACM Transactions on Mathematical Software, vol.35, issue.3, pp.1-42, 2008.
DOI : 10.1145/1391989.1391992
URL : https://hal.archives-ouvertes.fr/hal-00018223

J. Dumas, C. Pernet, and Z. Sultan, Simultaneous computation of the row and column rank profiles, Proceedings of the 38th International Symposium on Symbolic and Algebraic Computation, ISSAC '13, pp.181-188, 2013.
DOI : 10.1145/2465506.2465517
URL : https://hal.archives-ouvertes.fr/hal-00778136

J. Dumas, C. Pernet, and Z. Sultan, Computing the Rank Profile Matrix, Proceedings of the 2015 ACM on International Symposium on Symbolic and Algebraic Computation, ISSAC '15, pp.149-156, 2015.
DOI : 10.1145/2755996.2756682
URL : https://hal.archives-ouvertes.fr/hal-01107722

J. Faugère, A new efficient algorithm for computing Gröbner bases (F4), Journal of Pure and Applied Algebra, vol.139, issue.1-3, pp.61-88, 1999.

J. von zur Gathen and J. Gerhard, Modern Computer Algebra, 2013.
DOI : 10.1017/CBO9781139856065

T. Gautier, J. V. Ferreira-lima, N. Maillard, and B. Raffin, XKaapi: A Runtime System for Data-Flow Task Programming on Heterogeneous Architectures, 2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013.
DOI : 10.1109/IPDPS.2013.66
URL : https://hal.archives-ouvertes.fr/hal-00799904

A. M. Gleixner, D. E. Steffy, and K. Wolter, Improving the accuracy of linear programming solvers with iterative refinement, Proceedings of the 37th International Symposium on Symbolic and Algebraic Computation, ISSAC '12, pp.187-194, 2012.
DOI : 10.1145/2442829.2442858

L. Grigori, J. W. Demmel, and H. Xiang, CALU: A Communication Optimal LU Factorization Algorithm, SIAM Journal on Matrix Analysis and Applications, vol.32, issue.4, pp.1317-1350, 2011.
DOI : 10.1137/100788926
URL : https://hal.archives-ouvertes.fr/hal-00651137

F. G. Gustavson, Recursion leads to automatic variable blocking for dense linear-algebra algorithms, IBM Journal of Research and Development, vol.41, issue.6, pp.737-756, 1997.
DOI : 10.1147/rd.416.0737

C. Jeannerod, C. Pernet, and A. Storjohann, Rank-profile revealing Gaussian elimination and the CUP matrix decomposition, Journal of Symbolic Computation, vol.56, pp.46-68, 2013.
DOI : 10.1016/j.jsc.2013.04.004
URL : https://hal.archives-ouvertes.fr/hal-00655543

J. Jelinek, The GNU OpenMP implementation, 2014.

K. Klimkowski and R. A. van de Geijn, Anatomy of a parallel out-of-core dense linear solver, ICPP, pp.29-33, 1995.

B. Kumar, C. Huang, R. Johnson, and P. Sadayappan, A Tensor Product Formulation of Strassen's Matrix Multiplication Algorithm with Memory Reduction, Proceedings of the Seventh International Parallel Processing Symposium, pp.582-588, 1993.
DOI : 10.1155/1995/636457

B. Lipshitz, G. Ballard, J. Demmel, and O. Schwartz, Communication-Avoiding Parallel Strassen: Implementation and performance, 2012 International Conference for High Performance Computing, Networking, Storage and Analysis, pp.1-11, 2012.
DOI : 10.1109/SC.2012.33

A. Rémy, M. Baboulin, M. Sosonkina, B. Rozoy, M. Bader et al., Locality optimization on a NUMA architecture for hybrid LU factorization, Proc. Parallel Computing, ParCo 2013, pp.10-13, 2013.

W. Stein, Modular Forms, a Computational Approach, Graduate Studies in Mathematics, 2007.

V. Strassen, Gaussian elimination is not optimal, Numerische Mathematik, vol.13, issue.4, pp.354-356, 1969.
DOI : 10.1007/BF02165411

S. Toledo, Locality of Reference in LU Decomposition with Partial Pivoting, SIAM Journal on Matrix Analysis and Applications, vol.18, issue.4, pp.1065-1081, 1997.
DOI : 10.1137/S0895479896297744

Z. Zlatev, Computational Methods for General Sparse Matrices, 1991.
DOI : 10.1007/978-94-017-1116-6