Bridging the gap between OpenMP 4.0 and native runtime systems for the fast multipole method, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01372022
Task-based FMM for heterogeneous architectures. Concurrency and Computation: Practice and Experience, pp.2608-2629, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01359458
Task-based fast multipole method for clusters of multicore processors, 2016. ,
URL : https://hal.archives-ouvertes.fr/hal-01387482
Task-Based FMM for Multicore Architectures, SIAM Journal on Scientific Computing, vol.36, issue.1, pp.66-93, 2014. ,
DOI : 10.1137/130915662
URL : https://hal.archives-ouvertes.fr/hal-00807368
Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems, ACM Transactions on Mathematical Software, vol.43, issue.2, 2014. ,
DOI : 10.1145/2898348
URL : https://hal.archives-ouvertes.fr/hal-01333645
StarPU: A unified platform for task scheduling on heterogeneous multicore architectures. Concurrency and Computation: Practice and Experience, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00384363
How will the fast multipole method fare in the exascale era, SIAM News, vol.46, issue.6, pp.1-3, 2013. ,
Optimization and parallelization of the boundary element method for the wave equation in time domain. Theses, 2016. ,
URL : https://hal.archives-ouvertes.fr/tel-01306571
Fine-Grained Multithreading for the Multifrontal $QR$ Factorization of Sparse Matrices, SIAM Journal on Scientific Computing, vol.35, issue.4, pp.323-345, 2013. ,
DOI : 10.1137/110846427
URL : https://hal.archives-ouvertes.fr/hal-01122471
A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009. ,
DOI : 10.1016/j.parco.2008.10.002
Versatile, scalable, and accurate simulation of distributed applications and platforms, Journal of Parallel and Distributed Computing, vol.74, issue.10, p.74, 2014. ,
DOI : 10.1016/j.jpdc.2014.06.008
URL : https://hal.archives-ouvertes.fr/hal-01017319
A CPU, Proceedings of Workshop on General Purpose Processing Using GPUs, GPGPU-7, pp.64-64, 2014. ,
DOI : 10.1145/2588768.2576787
Performance Modeling Tools for Parallel Sparse Linear Algebra Computations In Parallel Computing: From Multicores and GPU's to Petascale, Proceedings of the conference ParCo, pp.83-90, 2009. ,
OmpSs: A PROPOSAL FOR PROGRAMMING HETEROGENEOUS MULTI-CORE ARCHITECTURES, Parallel Processing Letters, vol.21, issue.02, 2011. ,
DOI : 10.1142/S0129626411000151
A fast algorithm for particle simulations, Journal of Computational Physics, vol.73, issue.2, pp.325-348, 1987. ,
DOI : 10.1016/0021-9991(87)90140-9
Using MPI: Portable Parallel Programming with the Message Passing Interface. Scientific And Engineering Computation Series, 1999. ,
Parallel Simulation of Superscalar Scheduling, 2014 43rd International Conference on Parallel Processing, pp.121-130, 2014. ,
DOI : 10.1109/ICPP.2014.21
Superlu dist: A scalable distributedmemory sparse direct solver for unsymmetric linear systems, ACM Trans. Math. Softw, vol.29, issue.2, pp.110-140, 2003. ,
Converting thread-level parallelism to instruction-level parallelism via simultaneous multithreading, ACM Transactions on Computer Systems, vol.15, issue.3, pp.322-354, 1997. ,
DOI : 10.1145/263326.263382
Data-Driven Execution of Fast Multipole Methods. CoRR, abs, 1203. ,
Producing Wrong Data Without Doing Anything Obviously Wrong, Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, ASPLOS XIV, pp.265-276, 2009. ,
DOI : 10.1145/1508244.1508275
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.163.8395
MARSS, Proceedings of the 48th Design Automation Conference on, DAC '11, pp.1050-1055, 2011. ,
DOI : 10.1145/2024724.2024954
R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, 2016. ,
Trace-driven simulation of multithreaded applications, (IEEE ISPASS) IEEE INTERNATIONAL SYMPOSIUM ON PERFORMANCE ANALYSIS OF SYSTEMS AND SOFTWARE, pp.87-96, 2011. ,
DOI : 10.1109/ISPASS.2011.5762718
The structural simulation toolkit, ACM SIGMETRICS Performance Evaluation Review, vol.38, issue.4, pp.37-42, 2011. ,
DOI : 10.1145/1964218.1964225
A Framework for Performance Modeling and Prediction, ACM/IEEE SC 2002 Conference (SC'02), pp.1-17, 2002. ,
DOI : 10.1109/SC.2002.10004
Fast and Accurate Simulation of Multithreaded Sparse Linear Algebra Solvers, 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS), 2015. ,
DOI : 10.1109/ICPADS.2015.67
URL : https://hal.archives-ouvertes.fr/hal-01180272
Faithful Performance Prediction of a Dynamic Task- Based Runtime System for Heterogeneous Multi-Core Architectures. Concurrency and Computation: Practice and Experience, p.16, 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01147997
Guest editors' introduction: The top 10 algorithms, Computing in Science & Engineering, vol.2, issue.1, pp.22-23, 2000. ,
Performance-effective and low-complexity task scheduling for heterogeneous computing, IEEE Transactions on Parallel and Distributed Systems, vol.13, issue.3, pp.260-274, 2002. ,
DOI : 10.1109/71.993206
Are Cycle Accurate Simulations a Waste of Time?, Proc. of the 7th Workshop on Duplicating, Deconstruction and Debunking, 2008. ,
BigSim: A Parallel Simulator for Performance Prediction of Extremely Large Parallel Machines, Proc. of the 18th International Parallel and Distributed Processing Symposium (IPDPS), 2004. ,