StarPU: A Unified Platform for Task Scheduling on Heterogeneous Multicore Architectures, Concurrency and Computation: Practice and Experience, Special Issue: Euro-Par, 2009. ,
Implementing OmpSs support for regions of data in architectures with multiple address spaces, Proceedings of the 27th International ACM Conference on Supercomputing, ICS '13, 2013. ,
DOI : 10.1145/2464996.2465017
Locality-aware work stealing on multi-CPU and multi-GPU architectures, Workshop on Programmability Issues for Heterogeneous Multicores (MULTIPROG), 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00780890
StarPU-MPI: Task Programming over Clusters of Machines Enhanced with Accelerators, INRIA, Research Report RR-8538, 2014. ,
Harnessing Supercomputers with a Sequential Task-based Runtime System, INRIA, Tech. Rep ,
Boundary element methods, in Boundary Element Methods, ser, 2011. ,
Distributed dense numerical linear algebra algorithms on massively parallel architectures, 2010. ,
Hierarchical DAG Scheduling for Hybrid Distributed Systems, 2015 IEEE International Parallel and Distributed Processing Symposium, 2015. ,
DOI : 10.1109/IPDPS.2015.56
URL : https://hal.archives-ouvertes.fr/hal-01078359
A Scheduling and Runtime Framework for a Cluster of Heterogeneous Machines with Multiple Accelerators, 2015 IEEE International Parallel and Distributed Processing Symposium, 2015. ,
DOI : 10.1109/IPDPS.2015.12
Optimizing a parallel runtime system for multicore clusters, Proceedings of the 2010 TeraGrid Conference, TG '10, 2010. ,
DOI : 10.1145/1838574.1838586
Parallel Scheduling of Task Trees with Limited Memory, ACM Transactions on Parallel Computing, vol.2, issue.2, 2015. ,
DOI : 10.1145/2779052
URL : https://hal.archives-ouvertes.fr/hal-01160118
A study of memory-aware scheduling in message driven parallel programs, International Conference on High Performance Computing, 2010. ,
Bounded memory scheduling of dynamic task graphs, International Conference on Parallel Architectures and Compilation, 2014. ,
Resource-Aware Task Scheduling, Workshop on Parallel Programming and Run-Time Management Techniques for Many-core Architectures, 2013. ,
DOI : 10.1145/2638554
List scheduling in embedded systems under memory constraints, International Symposium on Computer Architecture and High Performance Computing, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00906117
Memory Analysis and Optimized Allocation of Dataflow Applications on Shared-Memory MPSoCs, Journal of VLSI Signal Processing-Systems for Signal, Image, and Video Technology, 2014. ,
DOI : 10.1007/s11265-014-0952-6
URL : https://hal.archives-ouvertes.fr/hal-01083576
Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems, ACM Transactions on Mathematical Software, vol.43, issue.2, 2014. ,
DOI : 10.1145/2898348
URL : https://hal.archives-ouvertes.fr/hal-01333645
Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms, International Conference on Parallel Processing, Euro-Par, 2011. ,
Multifrontal QR Factorization for Multicore Architectures over Runtime Systems, Euro-Par 2013 Parallel Processing, 2013. ,
DOI : 10.1007/978-3-642-40047-6_53
URL : https://hal.archives-ouvertes.fr/hal-01220611
Approximation of boundary element matrices, Numerische Mathematik, vol.86, issue.4, 2000. ,
DOI : 10.1007/PL00005410
A look at scalable dense linear algebra libraries, Scalable High Performance Computing Conference, 1992. ,