OpenMP: Simple, portable, scalable SMP programming, 2006.

W. Nasri, L. A. Steffenel, and D. Trystram, Adaptive approaches for efficient parallel algorithms on cluster-based systems, International Journal of Grid and Utility Computing, vol.1, issue.2, pp.99-108, 2009.
DOI : 10.1504/IJGUC.2009.022026
URL : https://hal.archives-ouvertes.fr/hal-00953258

C. Liao and B. Chapman, Invited Paper: A Compile-time Cost Model for OpenMP, 2007 IEEE International Parallel and Distributed Processing Symposium, p.208, 2007.
DOI : 10.1109/IPDPS.2007.370398

R. Hockney, The communication challenge for MPP: Intel Paragon and Meiko CS-2, Parallel Computing, vol.20, issue.3, pp.389-398, 1994.
DOI : 10.1016/S0167-8191(06)80021-9

T. Kielmann, H. Bal, S. Gorlatch, K. Verstoep, and R. Hofman, Network performance-aware collective communication for clustered wide-area systems, Parallel Computing, vol.27, issue.11, pp.1431-1456, 2001.
DOI : 10.1016/S0167-8191(01)00098-9

M. E. Wolf, D. E. Maydan, and D. K. Chen, Combining loop transformations considering caches and scheduling, Proceedings of the 29th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 29), pp.274-286, 1996.
DOI : 10.1109/MICRO.1996.566468

K. Yotov, X. Li, G. Ren, M. Cibulskis, G. DeJong et al., A comparison of empirical and model-driven optimization, PLDI '03: Proceedings of the ACM SIGPLAN 2003 conference on Programming language design and implementation, pp.63-76, 2003.

B. Chapman, G. Jost, and R. van der Pas, Using OpenMP: Portable Shared Memory Parallel Programming, 2008.

R. C. Whaley, A. Petitet, and J. J. Dongarra, Automated empirical optimization of software and the ATLAS project, pp.3-35. Also available as University of Tennessee LAPACK Working Note #147, UT-CS-00-448, 2000.

J. Bilmes, K. Asanović, C.-W. Chin, and J. Demmel, Optimizing matrix multiply using PHiPAC: a Portable, High-Performance, ANSI C coding methodology, Proceedings of International Conference on Supercomputing, 1997.

M. O. McCracken, A. Snavely, and A. Malony, Performance Modeling for Dynamic Algorithm Selection, International Conference on Computational Science, pp.749-758, 2003.
DOI : 10.1007/3-540-44864-0_77

M. Frigo and S. G. Johnson, The Design and Implementation of FFTW3, Proceedings of the IEEE, vol.93, issue.2, pp.216-231, 2005.
DOI : 10.1109/JPROC.2004.840301

F. Berman, R. Wolski, H. Casanova, W. Cirne, H. Dail et al., Adaptive computing on the grid using AppLeS, IEEE Transactions on Parallel and Distributed Systems, vol.14, issue.4, 2003.
DOI : 10.1109/TPDS.2003.1195409

O. Hartmann, M. Kühnemann, T. Rauber, and G. Rünger, Adaptive selection of communication methods to optimize collective MPI operations, Proceedings of the 12th Workshop on Compilers for Parallel Computers (CPC'06), 2006.

P. Bhat, V. Prasanna, and C. Raghavendra, Adaptive communication algorithms for distributed heterogeneous systems, Proceedings of the IEEE International Symposium on High Performance Distributed Computing, 1998.

B. Hong and V. K. Prasanna, Adaptive matrix multiplication in heterogeneous environments, Proceedings of the Ninth International Conference on Parallel and Distributed Systems, p.129, 2002.
DOI : 10.1109/ICPADS.2002.1183389

H. Yu and L. Rauchwerger, An adaptive algorithm selection framework for reduction parallelization, IEEE Transactions on Parallel and Distributed Systems, 2006.

N. Thomas, G. Tanase, O. Tkachyshyn, J. Perdue, N. M. Amato et al., A framework for adaptive algorithm selection in STAPL, Proceedings of the tenth ACM SIGPLAN symposium on Principles and practice of parallel programming, PPoPP '05, pp.277-288, 2005.
DOI : 10.1145/1065944.1065981

J. Cuenca, D. Giménez, J. González, J. Dongarra, and K. Roche, Automatic optimisation of parallel linear algebra routines in systems with variable load, Proceedings of the Eleventh Euromicro Conference on Parallel, Distributed and Network-Based Processing, pp.409-416, 2003.
DOI : 10.1109/EMPDP.2003.1183618

D. Chandra, F. Guo, S. Kim, and Y. Solihin, Predicting Inter-Thread Cache Contention on a Chip Multi-Processor Architecture, 11th International Symposium on High-Performance Computer Architecture, pp.340-351, 2005.
DOI : 10.1109/HPCA.2005.27

R. H. Saavedra and A. J. Smith, Analysis of benchmark characteristics and benchmark performance prediction, ACM Transactions on Computer Systems, vol.14, issue.4, pp.344-384, 1996.
DOI : 10.1145/235543.235545

A. Snavely, N. Wolter, and L. Carrington, Modeling application performance by convolving machine signatures with application profiles, Proceedings of the Fourth Annual IEEE International Workshop on Workload Characterization (WWC-4), pp.149-156, 2001.
DOI : 10.1109/WWC.2001.990754

R. Jin and G. Agrawal, Performance prediction for random write reductions: a case study in modeling shared memory programs, SIGMETRICS'02: Proceedings of the 2002 ACM SIGMETRICS international conference on Measurement and modeling of computer systems, pp.117-128, 2002.

V. S. Adve and M. K. Vernon, Parallel program performance prediction using deterministic task graph analysis, ACM Transactions on Computer Systems, vol.22, issue.1, pp.94-136, 2004.
DOI : 10.1145/966785.966788