Numerical linear algebra on emerging architectures: The PLASMA and MAGMA projects, Journal of Physics: Conference Series, p.12037, 2009. ,
DOI : 10.1088/1742-6596/180/1/012037
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, pp.180-186, 2010. ,
DOI : 10.1109/PDP.2010.67
URL : https://hal.archives-ouvertes.fr/inria-00429889
Generalized lattice-Boltzmann equations, Rarefied Gas Dynamics: Theory and Simulations, pp.450-458, 1994. ,
Multiple-relaxation-time lattice Boltzmann models in three dimensions, Philosophical Transactions: Mathematical, Physical and Engineering Sciences, pp.437-451, 2002. ,
Exploring New Architectures in Accelerating CFD for Air Force Applications, 2008 DoD HPCMP Users Group Conference, pp.14-17, 2008. ,
DOI : 10.1109/DoD.HPCMP.UGC.2008.12
GPU cluster for high performance computing, Proceedings of the 2004 ACM/IEEE conference on Supercomputing, p.47, 2004. ,
Lattice-Gas Automata for the Navier-Stokes Equation, Physical Review Letters, vol.56, issue.14, pp.1505-1508, 1986. ,
DOI : 10.1103/PhysRevLett.56.1505
Palabos Benchmarks (3D Lid-driven Cavity on Blue Gene/P) ,
Use of the Boltzmann Equation to Simulate Lattice-Gas Automata, Physical Review Letters, vol.61, issue.20, pp.2332-2335, 1988. ,
DOI : 10.1103/PhysRevLett.61.2332
A new approach to the lattice Boltzmann method for graphics processing units, Computers & Mathematics with Applications, vol.61, issue.12, 2010. ,
DOI : 10.1016/j.camwa.2010.01.054
URL : https://hal.archives-ouvertes.fr/inria-00568674
Global Memory Access Modelling for Efficient Implementation of the LBM on GPUs, High Performance Computing for Computational Science - VECPAR 2010, Lecture Notes in Computer Science, 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-01003059
Implementation of a Lattice-Boltzmann method for numerical fluid mechanics using the nVIDIA CUDA technology, Computer Science - Research and Development, vol.8, issue.4, pp.241-247, 2009. ,
DOI : 10.1007/s00450-009-0087-3
Optimizing matrix transpose in CUDA. NVIDIA CUDA SDK Application Note, 2009. ,
Implementation of a Lattice Boltzmann kernel using the Compute Unified Device Architecture developed by nVIDIA, Computing and Visualization in Science, vol.17, issue.4, pp.1-11, 2008. ,
DOI : 10.1007/s00791-008-0120-2
TeraFLOP computing on a desktop PC with GPUs for 3D CFD, International Journal of Computational Fluid Dynamics, vol.77, issue.7, pp.443-456, 2008. ,
DOI : 10.1002/cav.143