S. Thibault, R. Namyst, and P. Wacrenier, Building Portable Thread Schedulers for Hierarchical Multiprocessors: The BubbleSched Framework, European Conference on Parallel Computing (Euro-Par), 2007.
DOI : 10.1007/978-3-540-74466-5_6
URL : https://hal.archives-ouvertes.fr/inria-00154506

F. Broquedis, J. Clet-Ortega, S. Moreaud, N. Furmento, B. Goglin et al., hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications, 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010.
DOI : 10.1109/PDP.2010.67
URL : https://hal.archives-ouvertes.fr/inria-00429889

F. Broquedis, F. Diakhaté, S. Thibault, O. Aumage, R. Namyst et al., Scheduling Dynamic OpenMP Applications over Multicore Architectures, International Workshop on OpenMP (IWOMP), 2008.
DOI : 10.1007/978-3-540-79561-2_15
URL : https://hal.archives-ouvertes.fr/inria-00329934

E. Ayguadé, M. Gonzalez, X. Martorell, and G. Jost, Employing Nested OpenMP for the Parallelization of Multi-Zone Computational Fluid Dynamics Applications, 18th International Parallel and Distributed Processing Symposium (IPDPS), 2004.

H. Jin, B. Chapman, L. Huang, D. Mey, and T. Reichstein, Performance Evaluation of a Multi-Zone Application in Different OpenMP Approaches, International Journal of Parallel Programming, vol.36, issue.3, pp.312-325, 2008.
DOI : 10.1007/s10766-008-0074-5

B. M. Chapman, L. Huang, H. Jin, G. Jost, and B. R. de Supinski, Extending OpenMP Worksharing Directives for Multithreading, European Conference on Parallel Computing (Euro-Par), 2006.

F. Broquedis, N. Furmento, B. Goglin, R. Namyst, and P. Wacrenier, Dynamic Task and Data Placement over NUMA Architectures: An OpenMP Runtime Perspective, Evolving OpenMP in an Age of Extreme Parallelism, 5th International Workshop on OpenMP, pp.79-92, 2009.
URL : https://hal.archives-ouvertes.fr/inria-00367570

S. Benkner and T. Brandes, Efficient parallel programming on scalable shared memory systems with High Performance Fortran, Concurrency: Practice and Experience, pp.789-803, 2002.
DOI : 10.1002/cpe.649

B. M. Chapman, F. Bregier, A. Patil, and A. Prabhakar, Achieving performance under OpenMP on ccNUMA and software distributed shared memory systems, Concurrency: Practice and Experience, pp.713-739, 2002.
DOI : 10.1002/cpe.646

D. Nikolopoulos, T. S. Papatheodorou, C. D. Polychronopoulos, J. Labarta, and E. Ayguadé, User-level dynamic page migration for multiprogrammed shared-memory multiprocessors, Proceedings 2000 International Conference on Parallel Processing, pp.95-103, 2000.
DOI : 10.1109/ICPP.2000.876083

H. Löf and S. Holmgren, affinity-on-next-touch: increasing the performance of an industrial PDE solver on a cc-NUMA system, 19th ACM International Conference on Supercomputing, pp.387-392, 2005.

C. Terboven, D. Mey, D. Schmidl, H. Jin, and T. Reichstein, Data and thread affinity in OpenMP programs, Proceedings of the 2008 workshop on Memory access on future processors: a solved problem?, MAW '08, pp.377-384, 2008.
DOI : 10.1145/1366219.1366222

X. Martorell, E. Ayguadé, N. Navarro, J. Corbalán, M. González et al., Thread fork/join techniques for multi-level parallelism exploitation in NUMA multiprocessors, Proceedings of the 13th international conference on Supercomputing, ICS '99, pp.294-301, 1999.
DOI : 10.1145/305138.305206

Y. Tanaka, K. Taura, M. Sato, and A. Yonezawa, Performance Evaluation of OpenMP Applications with Nested Parallelism, Languages, Compilers, and Run-Time Systems for Scalable Computers, pp.100-112, 2000.
DOI : 10.1007/3-540-40889-4_8

B. Saha, A. Adl-Tabatabai, R. L. Hudson, V. Menon, T. Shpeisman et al., Runtime Environment for Tera-scale Platforms, Intel Technology Journal, vol.11, issue.3, 2007.