J. M. Cebrián, L. Natvig, and J. C. Meyer, Improving Energy Efficiency through Parallelization and Vectorization on Intel Core i5 and i7 Processors, Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC), pp.675-684, 2012.

Intel Corporation, Intel® Xeon® Processor E5 v4 Product Family: Thermal Mechanical Specification and Design Guide, 2017.

Intel Corporation, Intel® Xeon® Processor E5 v3 Product Family: Specification Update, 2017.

G. Lento, Optimizing performance with Intel® Advanced Vector Extensions, 2014.

D. Balouek, Adding virtualization capabilities to the Grid'5000 testbed, Cloud Computing and Services Science, ser. Communications in Computer and Information Science, vol.367, pp.3-20, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00946971

Intel Corporation, Intel® Xeon® Processor Scalable Family: Specification Update, 2018.

Intel Corporation, Intel® Xeon® Processor E5 v4 Product Family: Specification Update, 2016.

J. Treibig, G. Hager, and G. Wellein, LIKWID: A Lightweight Performance-Oriented Tool Suite for x86 Multicore Environments, Int. Conf. on Parallel Processing (ICPP Workshops), pp.207-216, 2010.

D. Hackenberg, R. Schöne, T. Ilsche, D. Molka, J. Schuchart et al., An Energy Efficiency Feature Survey of the Intel Haswell Processor, IEEE Int. Parallel and Distributed Processing Symposium (IPDPS Workshops), pp.896-904, 2015.

S. Desrochers, C. Paradis, and V. M. Weaver, A Validation of DRAM RAPL Power Measurements, International Symposium on Memory Systems (MEMSYS), pp.455-470, 2016.

K. Khan, M. Hirki, T. Niemi, J. Nurminen, and Z. Ou, RAPL in Action: Experiences in Using RAPL for Power Measurements, ACM Trans. on Modeling and Performance Evaluation of Computing Systems (TOMPECS), vol.3, 2018.

A. Petitet, R. C. Whaley, J. Dongarra, and A. Cleary, HPL - a portable implementation of the High-Performance Linpack benchmark for distributed-memory computers, 2000.

A. Haidar, J. Kurzak, and P. Luszczek, An improved parallel singular value algorithm and its implementation for multicore hardware, Int. Conf. for High Performance Computing, Networking, Storage and Analysis (SC), 2013.

A. Cassagne, T. Tonnellier, C. Leroux, B. Le Gal, O. Aumage et al., Beyond Gbps Turbo decoder on multi-core CPUs, Int. Symp. on Turbo Codes and Iterative Information Processing, pp.136-140, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01363980

L. Noé, M. Gîrdea, and G. Kucherov, Seed design framework for mapping SOLiD reads, Research in Computational Molecular Biology, pp.384-396, 2010.

A. Buttari, J. Langou, J. Kurzak, and J. Dongarra, A class of parallel tiled linear algebra algorithms for multicore architectures, Parallel Computing, vol.35, issue.1, pp.38-53, 2009.

J. D. McCalpin, Memory bandwidth and machine balance in current high performance computers, IEEE Computer Society Technical Committee on Computer Architecture (TCCA) Newsletter, pp.19-25, 1995.

H. Inoue, How SIMD width affects energy efficiency: A case study on sorting, IEEE Symp. in Low-Power and High-Speed Chips (COOL CHIPS), 2016.

H. Chen, G. Grider, J. Inman, P. Fields, and J. A. Kuehn, An empirical study of performance, power consumption, and energy cost of erasure code computing for HPC cloud storage systems, 2015 IEEE International Conference on Networking, Architecture and Storage (NAS), pp.71-80, 2015.

J. Schuchart, D. Hackenberg, R. Schöne, T. Ilsche, R. Nagappan et al., The Shift from Processor Power Consumption to Performance Variations: Fundamental Implications at Scale, Computer Science - Research and Development, vol.31, issue.4, pp.197-205, 2016.

T. Jakobs and G. Rünger, On the energy consumption of load/store AVX instructions, Federated Conference on Computer Science and Information Systems (FedCSIS), pp.319-327, 2018.

T. Jakobs and G. Rünger, Examining energy efficiency of vectorization techniques using a Gaussian elimination, Int. Conference on High Performance Computing & Simulation (HPCS), pp.268-275, 2018.

P. Gepner, V. Gamayunov, and D. L. Fraser, Early performance evaluation of AVX for HPC, International Conference on Computational Science (ICCS), vol.4, pp.452-460, 2011.

A. Mazouz, D. C. Wong, D. Kuck, and W. Jalby, An incremental methodology for energy measurement and modeling, ACM/SPEC Int. Conf. on Performance Engineering, pp.15-26, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01788299

J. M. Cebrián, L. Natvig, and J. C. Meyer, Performance and energy impact of parallelization and vectorization techniques in modern microprocessors, Computing, vol.96, issue.12, pp.1179-1193, 2014.