S. Albensoeder and H. C. Kuhlmann, Accurate three-dimensional lid-driven cavity flow, Journal of Computational Physics, vol.206, issue.2, pp.536-558, 2005.
DOI : 10.1016/j.jcp.2004.12.024

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimizations of software and the ATLAS project, Parallel Computing, vol.27, issue.1, pp.3-35, 2001.

D. Crockford, The application/json Media Type for JavaScript Object Notation (JSON), RFC 4627, 2006.

D. d'Humières, I. Ginzburg, M. Krafczyk, P. Lallemand, and L. S. Luo, Multiple-relaxation-time lattice Boltzmann models in three dimensions, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, vol.360, issue.1792, pp.437-451, 2002.
DOI : 10.1098/rsta.2001.0955

Z. Fan, F. Qiu, A. Kaufman, and S. Yoakum-Stover, GPU cluster for high performance computing, Proceedings of the 2004 ACM/IEEE conference on Supercomputing, pp.47-58, 2004.

P. Geoffray, L. Prylli, and B. Tourancheau, BIP-SMP, Proceedings of the 1999 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '99, pp.20-38, 1999.
DOI : 10.1145/331532.331552

P. Geoffray, C. Pham, and B. Tourancheau, A Software Suite for High-Performance Communications on Clusters of SMPs, Cluster Computing, vol.5, issue.4, pp.353-363, 2002.
DOI : 10.1023/A:1019756120212

X. He and L. Luo, Theory of the lattice Boltzmann method: From the Boltzmann equation to the lattice Boltzmann equation, Physical Review E, vol.56, issue.6, pp.6811-6817, 1997.
DOI : 10.1103/PhysRevE.56.6811

W. Li, X. Wei, and A. Kaufman, Implementing lattice Boltzmann computation on graphics hardware, The Visual Computer, vol.19, issue.7-8, pp.444-456, 2003.
DOI : 10.1007/s00371-003-0210-6

C. Obrecht, F. Kuznik, B. Tourancheau, and J. Roux, Global Memory Access Modelling for Efficient Implementation of the Lattice Boltzmann Method on Graphics Processing Units, High Performance Computing for Computational Science, VECPAR 2010 Revised Selected Papers, pp.151-161, 2011.

URL : https://hal.archives-ouvertes.fr/inria-00563159

C. Obrecht, F. Kuznik, B. Tourancheau, and J. Roux, A new approach to the lattice Boltzmann method for graphics processing units, Computers & Mathematics with Applications, vol.61, issue.12, pp.3628-3638, 2011.
DOI : 10.1016/j.camwa.2010.01.054

URL : https://hal.archives-ouvertes.fr/inria-00568674

C. Obrecht, F. Kuznik, B. Tourancheau, and J. Roux, The TheLMA project: Multi-GPU implementation of the lattice Boltzmann method, International Journal of High Performance Computing Applications, vol.25, issue.3, pp.295-303, 2011.
DOI : 10.1177/1094342011414745

URL : https://hal.archives-ouvertes.fr/hal-00731122

F. Song, S. Tomov, and J. Dongarra, Efficient Support for Matrix Computations on Heterogeneous Multi-core and Multi-GPU Architectures, 2011.

J. Tölke and M. Krafczyk, TeraFLOP computing on a desktop PC with GPUs for 3D CFD, International Journal of Computational Fluid Dynamics, vol.22, issue.7, pp.443-456, 2008.

X. Wang and T. Aoki, Multi-GPU performance of incompressible flow computation by lattice Boltzmann method on GPU cluster, Parallel Computing, vol.37, issue.9, pp.521-535, 2011.