, Livermore

, NAS Parallel Benchmarks applications (NPB). Tech. rep., NASA Advanced Supercomputing Division

D. Barthou, A. C. Rubial, W. Jalby, S. Koliai, and C. Valensi, Performance tuning of x86 OpenMP codes with MAQAO, Tools for High Performance Computing, pp.95-113, 2009.

D. Böhme, M. Geimer, F. Wolf, and L. Arnold, Identifying the root causes of wait states in large-scale parallel applications, 2010 39th International Conference on Parallel Processing, pp.90-100, 2010.

A. Calotoiu, T. Hoefler, M. Poke, and F. Wolf, Using automated performance modeling to find scalability bugs in complex codes, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, p.45, 2013.

C. Coarfa, J. M. Mellor-Crummey, N. Froyd, and Y. Dotsenko, Scalability analysis of SPMD codes using expectations, Proceedings of the 21st Annual International Conference on Supercomputing, pp.13-22, 2007.

K. Coulomb, A. Degomme, M. Faverge, and F. Trahay, An open-source tool-chain for performance analysis, Tools for High Performance Computing, pp.37-48, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00707236

M. Ghane, A. M. Malik, B. Chapman, and A. Qawasmeh, False sharing detection in OpenMP applications using OMPT API, International Workshop on OpenMP, pp.102-114, 2015.

R. Guerraoui, H. Guiroux, R. Lachaize, V. Quéma, and V. Trigonakis, Lock-unlock: Is that all? A pragmatic analysis of locking in software systems, ACM Transactions on Computer Systems (TOCS), vol.36, issue.1, p.1, 2019.
URL : https://hal.archives-ouvertes.fr/hal-02084060

K. A. Huck, A. D. Malony, S. Shende, and D. W. Jacobsen, Integrated measurement for cross-platform OpenMP performance analysis, Using and Improving OpenMP for Devices, Tasks, and More - 10th International Workshop on OpenMP, pp.146-160, 2014.

C. Iwainsky, S. Shudler, A. Calotoiu, A. Strube, M. Knobloch et al., How many threads will be too many? On the scalability of OpenMP implementations, European Conference on Parallel Processing, pp.451-463, 2015.

I. Karlin, J. Keasler, and J. Neely, LULESH 2.0 updates and changes, 2013.

A. Knüpfer, C. Rössel, D. Mey, S. Biersdorff, K. Diethelm et al., Score-P: A joint performance measurement run-time infrastructure for Periscope, Scalasca, TAU, and Vampir, Tools for High Performance Computing, pp.79-91, 2011.

M. S. Müller, A. Knüpfer, M. Jurenz, M. Lieber, H. Brunst et al., Developing scalable applications with Vampir, VampirServer and VampirTrace, Parallel Computing (PARCO), vol.15, pp.637-644, 2007.

B. Putigny, B. Goglin, and D. Barthou, A benchmark-based performance model for memory-bound HPC applications, 2014 International Conference on High Performance Computing & Simulation (HPCS), pp.943-950, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00985598

J. Reinders, VTune performance analyzer essentials, 2005.

M. Woodyard, An experimental model to analyze OpenMP applications for system utilization, International Workshop on OpenMP, pp.22-36, 2011.