Another possible evolution of our solver is towards distributed-memory parallel systems: modern runtime systems, such as StarPU, can handle this type of architecture by transparently managing data transfers between nodes over the network. A solver that implements all of the above-mentioned features is our ultimate objective.

Sparse QR factorization on the GPU, 2015.
A CPU-GPU hybrid approach for the unsymmetric multifrontal method, Parallel Computing, vol.37, issue.12, pp.759-770, 2011.
DOI : 10.1016/j.parco.2011.09.002
Scheduling a Parallel Sparse Direct Solver to Multiple GPUs, 2013 IEEE International Symposium on Parallel & Distributed Processing, Workshops and Phd Forum, pp.1401-1408, 2013.
DOI : 10.1109/IPDPSW.2013.26
A Sparse Symmetric Indefinite Direct Solver for GPU Architectures, Tech. Rep. RAL-P, 2014.
DOI : 10.1145/2756548
GPU-accelerated sparse LU factorization for circuit simulation with performance modeling, IEEE Transactions on Parallel and Distributed Systems, vol.26, issue.3, pp.786-795, 2015.
A Distributed CPU-GPU Sparse Direct Solver, Euro-Par 2014 Parallel Processing, pp.487-498, 2014.
DOI : 10.1007/978-3-319-09873-9_41
A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.
DOI : 10.1016/j.parco.2008.10.002
Programming matrix algorithms-by-blocks for thread-level parallelism, ACM Transactions on Mathematical Software, vol.36, issue.3, 2009.
DOI : 10.1145/1527286.1527288
Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series, vol.180, issue.1, p.12037, 2009.
DOI : 10.1088/1742-6596/180/1/012037
Dense linear algebra on distributed heterogeneous hardware with a symbolic dag approach, Scalable Computing and Communications: Theory and Practice, 2013.
StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par, pp.187-198, 2009.
DOI : 10.1007/978-3-642-03869-3_80
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.220.5547
DAGuE: A generic distributed DAG engine for High Performance Computing, Parallel Computing, vol.38, issue.1-2, pp.37-51, 2012.
DOI : 10.1016/j.parco.2011.10.003
Parallelizing dense and banded linear algebra libraries using SMPSs, Concurrency and Computation: Practice and Experience, pp.2438-2456, 2009.
DOI : 10.1002/cpe.1463
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.140.3457
Multi-GPU and Multi-CPU Parallelization for Interactive Physics Simulations, Euro-Par 2010, p.235.
DOI : 10.1007/978-3-642-15291-7_23
URL : https://hal.archives-ouvertes.fr/inria-00502448
Taking Advantage of Hybrid Systems for Sparse Direct Solvers via Task-Based Runtimes, 2014 IEEE International Parallel & Distributed Processing Symposium Workshops, p.5, 2014.
DOI : 10.1109/IPDPSW.2014.9
URL : https://hal.archives-ouvertes.fr/hal-00925017
A Parallel Sparse Direct Solver via Hierarchical DAG Scheduling, ACM Transactions on Mathematical Software, vol.41, issue.1, pp.1-3, 2014.
DOI : 10.1145/2629641
Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems, ACM Transactions on Mathematical Software, vol.43, issue.2.
DOI : 10.1145/2898348
URL : https://hal.archives-ouvertes.fr/hal-01333645
Algorithm 915, SuiteSparseQR, ACM Transactions on Mathematical Software, vol.38, issue.1, pp.1-822, 2011.
DOI : 10.1145/2049662.2049670
The Multifrontal Solution of Indefinite Sparse Symmetric Linear, ACM Transactions on Mathematical Software, vol.9, issue.3, pp.302-325, 1983.
DOI : 10.1145/356044.356047
A New Implementation of Sparse Gaussian Elimination, ACM Transactions on Mathematical Software, vol.8, issue.3, pp.256-276, 1982.
DOI : 10.1145/356004.356006
Multifrontal QR Factorization in a Multiprocessor Environment, Numerical Linear Algebra with Applications, vol.3, issue.4, pp.275-300, 1996.
DOI : 10.1002/(SICI)1099-1506(199607/08)3:4<275::AID-NLA83>3.0.CO;2-7
Fine-Grained Multithreading for the Multifrontal $QR$ Factorization of Sparse Matrices, SIAM Journal on Scientific Computing, vol.35, issue.4, pp.323-345, 2013.
DOI : 10.1137/110846427
URL : https://hal.archives-ouvertes.fr/hal-01122471
Fully dynamic scheduler for numerical computing on multicore processors, LAPACK working note, 2009.
Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002.
DOI : 10.1109/71.993206
Hierarchical DAG Scheduling for Hybrid Distributed Systems, 2015 IEEE International Parallel and Distributed Processing Symposium, 2015.
DOI : 10.1109/IPDPS.2015.56
URL : https://hal.archives-ouvertes.fr/hal-01078359
Task scheduling for parallel sparse Cholesky factorization, International Journal of Parallel Programming, vol.27, issue.4, pp.291-314, 1989.
DOI : 10.1007/BF01407861
Task-based FMM for heterogeneous architectures, Concurrency and Computation: Practice and Experience, vol.7490, issue.9, 2014.
DOI : 10.1002/cpe.3723
URL : https://hal.archives-ouvertes.fr/hal-00974674