C. Bienia, S. Kumar, J. P. Singh, and K. Li, The PARSEC benchmark suite, Proceedings of the 17th international conference on Parallel architectures and compilation techniques, PACT '08, 2008.
DOI : 10.1145/1454115.1454128

J. Edmonds, Maximum matching and a polyhedron with 0,1-vertices, Journal of Research of the National Bureau of Standards Section B Mathematics and Mathematical Physics, vol.69, issue.1 and 2, pp.125-130, 1965.
DOI : 10.6028/jres.069B.013

H. Jin, M. Frumkin, and J. Yan, The OpenMP implementation of NAS parallel benchmarks and its performance, Tech. rep., 1999.

M. Kandemir, T. Yemliha, S. Muralidhara, S. Srikantaiah, M. J. Irwin et al., Cache topology aware computation mapping for multicores, ACM SIGPLAN Notices, vol.45, issue.6, pp.74-85, 2010.
DOI : 10.1145/1809028.1806605

G. Karypis and V. Kumar, Parallel multilevel k-way partitioning scheme for irregular graphs, Proceedings of the 1996 ACM/IEEE conference on Supercomputing (CDROM), Supercomputing '96, pp.96-129, 1998.
DOI : 10.1145/369028.369103

T. Klug, M. Ott, J. Weidendorfer, and C. Trinitis, autopin - automated optimization of thread-to-core pinning on multicore systems, Transactions on High-Performance Embedded Architectures and Compilers, 2008.

J. Lee, H. Wu, M. Ravichandran, and N. Clark, Thread tailor: dynamically weaving threads together for efficient, adaptive parallel applications, Proc. of the annual international symposium on Computer architecture, pp.270-279, 2010.

C. K. Luk, R. Cohn, R. Muth, H. Patil, A. Klauser et al., Pin: building customized program analysis tools with dynamic instrumentation, Proc. of the ACM SIGPLAN conference on Programming language design and implementation, PLDI '05, pp.190-200, 2005.

A. Mazouz, S. A. Touati, and D. Barthou, Performance evaluation and analysis of thread pinning strategies on multi-core platforms: Case study of SPEC OMP applications on Intel architectures, 2011 International Conference on High Performance Computing & Simulation, pp.273-279, 2011.
DOI : 10.1109/HPCSim.2011.5999834
URL : https://hal.archives-ouvertes.fr/inria-00636845

B. Mohr, A. D. Malony, S. Shende, and F. Wolf, Design and prototype of a performance tool interface for OpenMP, The Journal of Supercomputing, vol.23, issue.1, pp.105-128, 2002.
DOI : 10.1023/A:1015741304337

R. Jain, The Art of Computer Systems Performance Analysis : Techniques for Experimental Design, Measurement, Simulation, and Modelling, 1991.

F. Song, S. Moore, and J. Dongarra, Feedback-directed thread scheduling with memory considerations, Proceedings of the 16th international symposium on High performance distributed computing, HPDC '07, 2007.
DOI : 10.1145/1272366.1272380

F. Song, S. Moore, and J. Dongarra, Analytical modeling and optimization for affinity based thread scheduling on multicore systems, 2009 IEEE International Conference on Cluster Computing and Workshops, 2009.
DOI : 10.1109/CLUSTR.2009.5289173

D. Tam, R. Azimi, and M. Stumm, Thread clustering: sharing-aware scheduling on SMP-CMP-SMT multiprocessors, Proc. of the ACM SIGOPS/EuroSys European Conference on Computer Systems, EuroSys '07, pp.47-58, 2007.

C. Terboven, D. Mey, D. Schmidl, H. Jin, and T. Reichstein, Data and thread affinity in OpenMP programs, Proceedings of the 2008 workshop on Memory access on future processors: a solved problem?, MAW '08, pp.377-384, 2008.
DOI : 10.1145/1366219.1366222

S. A. Touati, J. Worms, and S. Briais, The Speedup-Test: a statistical methodology for programme speedup analysis and computation, To appear in the Journal of Concurrency and Computation: Practice and Experience, 2012.
DOI : 10.1002/cpe.2939

E. Z. Zhang, Y. Jiang, and X. Shen, Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs?, Proc. of the ACM SIGPLAN Symposium on Principles and practice of parallel programming, pp.203-212, 2010.