A. Abd-+-18]-guillaume-aupy, S. Benoit, L. Dai, P. Pottier, Y. Raghavan et al., Co-scheduling Amdahl applications on cache-partitioned systems. The Int, Journal of High Performance Computing Applications, vol.32, issue.1, pp.123-138, 2018.

G. , The validity of the single processor approach to achieving large scale computing capabilities, AFIPS Conference Proceedings, pp.483-485, 1967.

]. D. +-91 and . Bailey, The NAS Parallel Benchmarks -Summary and Preliminary Results, Proc. of the 1991 ACM/IEEE Conf. on Supercomputing, 1991.

C. Baa-+-16-;-andrew, H. Bauer, J. Abbasi, H. Ahrens, B. Childs et al., situ methods, infrastructures, and applications on high performance computing platforms, vol.35, pp.577-597, 2016.

B. D. Bui, M. Caccamo, L. Sha, and J. Martinez, Impact of cache partitioning on multitasking real time embedded systems, 4th IEEE Int. Conf. on Embedded and Real-Time Computing Systems and Applications, pp.101-110, 2008.

. Bdg-+-00]-shirley, J. Browne, N. Dongarra, G. Garner, P. Ho et al., A portable programming interface for performance evaluation on modern processors. The international journal of high performance computing applications, vol.14, pp.189-204, 2000.

. Bhp-+-17]-shunxing, Y. Bao, P. Huo, . Parvathaneni, J. Andrew et al., A data colocation grid framework for big data medical image processing-backend design, 2017.

. Pezy-computing, Zettascaler-2.0 configurable liquid immersion cooling system, 2017.

M. Dreher and B. Raffin, A Flexible Framework for Asynchronous In Situ and In Transit Analytics for Scientific Simulations, 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 2014.
URL : https://hal.archives-ouvertes.fr/hal-00941413

E. Strohmaier, The top500 benchmark, 2017.

N. Guan, M. Stigge, W. Yi, and G. Yu, Cache-aware scheduling and analysis for multicores, Proc. 7th ACM Int. Conf. Embedded Software, EMSOFT '09, pp.245-254, 2009.

A. Hartstein, V. Srinivasan, T. Puzak, and P. Emma, On the nature of cache miss behavior: Is it ? 2, The Journal of Instruction-Level Parallelism, vol.10, pp.1-22, 2008.

S. Kim, D. Chandra, and Y. Solihin, Fair cache sharing and partitioning in a chip multiprocessor architecture, Proceedings of the 13th International Conference on Parallel Architectures and Compilation Techniques, pp.111-122, 2004.

A. Krishna, A. Samih, and Y. Solihin, Data sharing in multi-threaded applications and its impact on chip design, Int. Symp. Performance Analysis of Systems and Software (ISPASS), pp.125-134, 2012.

L. Lcg-+-16]-david-lo, R. Cheng, . Govindaraju, C. Parthasarathy-ranganathan, and . Kozyrakis, Improving resource efficiency at scale with Heracles, ACM Transactions on Computer Systems (TOCS), vol.34, issue.2, 2016.

J. Leverich and C. Kozyrakis, Reconciling high server utilization and submillisecond quality-of-service, 9th European Conf. on Computer Systems, 2014.

Q. Lld-+-08]-jiang-lin, X. Lu, Z. Ding, X. Zhang, P. Zhang et al., Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems, High Performance Computer Architecture, pp.367-378, 2008.

L. Msm-+-11]-sai-prashanth-muralidhara, O. Subramanian, M. Mutlu, T. Kandemir, and . Moscibroda, Reducing memory interference in multicore systems via applicationaware memory channel partitioning, Proc. 44th IEEE/ACM Int. Sym. Microarchitecture, MICRO-44, pp.374-385, 2011.

V. Mvm-+-15]-preeti-malakar, T. Vishwanath, C. Munson, M. Knight, S. Hereld et al., Optimal scheduling of in-situ analysis for large-scale scientific simulations, Proc. of the Int. Conf. for High Performance Computing, Networking, Storage and Analysis, SC'15, 2015.

T. Khang and . Nguyen, Introduction to Cache Allocation Technology in the Intel R Xeon R Processor E5 v4 Family, 2016.

J. Kyle, J. Nesbit, J. E. Laudon, and . Smith, Virtual private caches, ACM SIGARCH Computer Architecture News, vol.35, issue.2, pp.57-68, 2007.

K. Moinuddin, Y. N. Qureshi, and . Patt, Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches, Microarchitecture, 2006. MICRO-39. 39th Annual IEEE/ACM International Symposium on, pp.423-432, 2006.

M. Brian, A. Rogers, . Krishna, B. Gordon, K. Bell et al., Scaling the bandwidth wall: challenges in and avenues for CMP scaling, ACM SIGARCH Computer Architecture News, vol.37, issue.3, pp.371-382, 2009.

C. Sewell, Large-scale compute-intensive analysis via a combined in-situ and co-scheduling workflow approach, Proc. of the Int. Conf. for High Perf. Computing, Networking, Storage and Analysis, SC'15, 2015.

D. Tam, R. Azimi, L. Soares, and M. Stumm, Managing shared l2 caches on multicore systems in software, Workshop on the Interaction between Operating Systems and Computer Architecture, pp.26-33, 2007.

G. Taylor, P. Davies, and M. Farmwald, The tlb slice-a low-cost high-speed address translation mechanism, Computer Architecture, 1990. Proceedings., 17th Annual International Symposium on, pp.355-363, 1990.

K. Tian, Y. Jiang, and X. Shen, A study on optimally co-scheduling jobs of different lengths on chip multiprocessors, Proc. 6th ACM Conf. Computing Frontiers, CF '09, pp.41-50, 2009.

S. Zhuravlev, S. Blagodurov, and A. Fedorova, Addressing shared resource contention in multicore processors via scheduling, ACM Sigplan Notices, vol.45, issue.3, pp.129-142, 2010.

Y. Zhang, A. Michael, J. Laurenzano, L. Mars, and . Tang, Smite: Precise QOS prediction on real-system SMT processors to improve utilization in warehouse scale computers, Proc. of the 47th Int. Symp. on Microarchitecture, pp.406-418, 2014.

. Zsb-+-12]-sergey, J. C. Zhuravlev, S. Saez, A. Blagodurov, M. Fedorova et al., Survey of scheduling techniques for addressing shared resources in multicore processors, ACM Computing Surveys (CSUR), vol.45, issue.1, p.4, 2012.