Overview of the proposed simulation methodology
Evolution of the power efficiency in the GREEN500
Evolution of the inclusion of accelerators in the GREEN500
Instantiation of the energy model used in chapters 6 and 7, p.14
Portion of a Batsim simulation sequence diagram, p.25
Mean bounded slowdown and makespan of all workload executions, p.37
Mean waiting time and makespan of all workload executions, p.38
Mean slowdown difference (real-simulated) for all workloads, p.39
Mean slowdown difference (real-simulated) distribution, p.40
Mean waiting time difference (real-simulated) distribution, p.41
Final section of Gantt charts coming from our evaluation process, p.42
Makespan against communication factor (homogeneous experiment), p.52
Makespan against communication factor (heterogeneous experiment), p.55
Figuration of the main idea behind the proposed algorithm, p.64
Normalized mean utilization against energy budget, p.71
Performance/energy trade-offs against energy budget, p.75
Energy against mean waiting time for best trade-off solutions, p.91
Energy against max waiting time for best trade-off solutions, p.92
Energy against number of switches for best trade-off solutions, p.93
Energy against mean waiting for all trade-off solutions, p.94
Energy saving opportunities over time
Most frequent types of months
List of Tables
5.1 The parameters of the clusters used in heterogeneous experiments, p.54
6.2 Average improvements when opportunistic shutdown is enabled, p.74
The experimental process parameter space
Bataar on Github, p.34
Batsimctn Project on the Inria Forge, p.39
Project on the Inria Forge, p.41, 2016.
Batsim Protocol Description on Github, vol.26, p.27
Batsched Gitlab Repository, vol.86, p.88
Evalys Project on Github, p.41
Project on the Inria Forge, p.40, 2016.
Supercomputers Take Big Green Leap in 2017, p.5, 2017.
Kamelot on Github, p.34
Grid5000 Nancy Clusters Description, p.34, 2016.
Piz Daint supercomputer description, vol.2, p.5, 2017.
Dror Feitelson. Parallel Workloads Archive, p.89
Artifacts to reproduce "Towards Energy Budget Control in HPC", p.68
Artifacts to reproduce "Performance vs Energy Tradeoffs via Shutdown Policies in EASY Backfilling", p.87
Tianhe-2 supercomputer description, p.5, 2017.
Flux: A next-generation resource management framework for large hpc centers, Parallel Processing Workshops (ICCPW), p.102, 2014.
Energy-efficient algorithms, Communications of the ACM, vol.53, p.98, 2010.
Opportunities and Challenges of Exascale Computing, Tech. rep. U.S. Department of Energy, p.2, 2010.
Adding Virtualization Capabilities to the Grid'5000 Testbed, Cloud Computing and Services Science, vol.367, p.68, 2012.
URL : https://hal.archives-ouvertes.fr/hal-00946971
Energy-Aware Scheduling for Real-Time Systems: A Survey, ACM Transactions on Embedded Computing Systems (TECS), vol.15, p.61, 2016.
Electrical Grid and Supercomputing Centers: An Investigative Analysis of Emerging Opportunities and Challenges, Informatik-Spektrum, vol.38, p.57, 2015.
Reducing the energy consumption of large scale computing systems through combined shutdown policies with multiple constraints, International Journal of High Performance Computing Applications, p.99, 2017.
URL : https://hal.archives-ouvertes.fr/hal-01557025
The case for energy-proportional computing, Computer, vol.40, issue.12, p.61, 2007.
Borg, omega, and kubernetes, Communications of the ACM, vol.59, p.3, 2016.
A batch scheduler with high level components, Cluster Computing and the Grid, vol.2, p.33, 2005.
URL : https://hal.archives-ouvertes.fr/hal-00005106
Versatile, Scalable, and Accurate Simulation of Distributed Applications and Platforms, Journal of Parallel and Distributed Computing, vol.74, p.48, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01017319
Simbatch: An API for simulating and predicting the performance of parallel resources managed by batch systems, Euro-Par 2008 Workshops-Parallel Processing, p.24, 2009.
Managing energy and server resources in hosting centers, ACM SIGOPS operating systems review, vol.35, p.2, 2001.
On the interplay of parallelization, program performance, and energy consumption, IEEE Transactions on Parallel and Distributed Systems, vol.21, p.81, 2010.
, Bibliography A5
The international exascale software project roadmap, International Journal of High Performance Computing Applications, vol.25, p.57, 2011.
Multi-objective scheduling, Introduction to scheduling, p.80, 2009.
URL : https://hal.archives-ouvertes.fr/hal-00800427
Towards Energy Budget Control in HPC, Cluster, Cloud and Grid Computing (CCGrid), p.98, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01533417
Batsim: a Realistic Language-Independent Resources and Jobs Management Systems Simulator, Job Scheduling Strategies for Parallel Processing (JSSPP), p.40, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01333471
Simulating Power Scheduling at Scale. Tech. rep. Lawrence Livermore National Laboratory (LLNL), p.2, 2017.
Parallel job scheduling for power constrained HPC systems, Parallel Computing, vol.38, p.60, 2012.
Understanding the future of energy-performance trade-off via DVFS in HPC environments, Journal of Parallel and Distributed Computing, vol.72, p.98, 2012.
Metrics for parallel job scheduling and their convergence, Workshop on Job Scheduling Strategies for Parallel Processing, vol.12, p.60, 2001.
Workload Modeling for Computer Systems Performance Evaluation, vol.35, p.79, 2015.
Resampling with Feedback - A New Paradigm of Using Workload Data for Performance Evaluation, European Conference on Parallel Processing, p.44, 2016.
Pitfalls in parallel job scheduling evaluation, Job Scheduling Strategies for Parallel Processing, vol.3834, p.60, 2005.
Maintaining a critical attitude towards simulation results (invited talk), vol.2, p.102, 2006.
Experience with using the parallel workloads archive, Journal of Parallel and Distributed Computing, vol.74, p.88, 2014.
Energy Accounting and Control with SLURM Resource and Job Management System, International Conference on Distributed Computing and Networking (ICDCN), vol.8314, p.59, 2014.
URL : https://hal.archives-ouvertes.fr/hal-01237596
A Scheduler-Level Incentive Mechanism for Energy Efficiency in HPC, 15th IEEE/ACM International Symposium on. IEEE, vol.2, p.59, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01230295
Adaptive Resource and Job Management for Limited Power Consumption, IEEE International Parallel and Distributed Processing Symposium Workshop, vol.81, p.98, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01230292
Scheduling with Communication Delay, Multiprocessor Scheduling: Theory and Applications, p.46, 2007.
URL : https://hal.archives-ouvertes.fr/lirmm-00195552
ScSF: a scheduling simulation framework, 21st Workshop on Job Scheduling Strategies for Parallel Processing, p.102, 2017.
From simulation to experiment: a case study on multiprocessor task scheduling, Parallel and Distributed Processing Workshops and Phd Forum, 2011 IEEE International Symposium on. IEEE, p.47, 2011.
DOI : 10.1109/ipdps.2011.201
URL : https://hal.archives-ouvertes.fr/hal-00627842
Saving 200kW and $200 K/year by power-aware job/machine scheduling, Parallel and Distributed Processing, p.61, 2008.
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center, NSDI, vol.11, p.3, 2011.
ZeroMQ: messaging for many applications, p.26, 2013.
Average and Competitive Analysis of Latency and Power Consumption of a Queuing System with a Sleep Mode, Proceedings of the 3rd International Conference on Future Energy Systems: Where Energy, Computing and Communication Meet. eEnergy '12, vol.14, p.98, 2012.
One Step towards Bridging the Gap between Theory and Practice in Moldable Task Scheduling with Precedence Constraints, Concurrency and Computation: Practice and Experience, vol.27, p.46, 2015.
Communication and topology-aware load balancing in charm++ with treematch, Cluster Computing (CLUSTER), p.47, 2013.
URL : https://hal.archives-ouvertes.fr/hal-00851148
Utility Driven Dynamic Resource Management in an Oversubscribed Energy-Constrained Heterogeneous System, Parallel & Distributed Processing Symposium Workshops (IPDPSW), p.61, 2014.
Alea 2: job scheduling simulator, Proceedings of the 3rd International ICST Conference on Simulation Tools and Techniques. ICST (Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering), vol.2, p.23, 2010.
Handbook of Scheduling: Algorithms, Models, and Performance Analysis. Chapman & Hall/CRC Computer and Information Science Series, p.46, 2004.
Proceedings of the Workshop on Job Scheduling Strategies for Parallel Processing. IPPS '95, p.47, 1995.
Backfilling with guarantees made as jobs arrive, Concurrency and Computation: Practice and Experience, vol.25, p.100, 2013.
Yarnsim: Simulating hadoop yarn, 15th IEEE/ACM International Symposium on. IEEE, p.3, 2015.
Advanced Scientific Computing Advisory Subcommittee (ASCAC) Report: Top Ten Exascale Research Challenges, p.1, 2014.
Contiguity and Locality in Backfilling Scheduling, Cluster, Cloud and Grid Computing (CCGrid), vol.51, p.59, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01230294
Simulation of batch scheduling using real production-ready software tools, Proceedings of the 5th IBERGRID, vol.24, p.102, 2011.
MPI+PRV+TIT-traces_NAS
Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling, Parallel and Distributed Systems, vol.12, p.100, 2001.
An integrated approach to evaluating simulation credibility. Tech. rep. NAVAL AIR WARFARE CENTER WEAPONS DIV CHINA LAKE CA, p.102, 2001.
Confidence intervals from normalized data: A correction to Cousineau, p.70, 2005.
Metascheduling of HPC Jobs in Day-Ahead Electricity Markets, IEEE 22nd International Conference on, p.61, 2015.
An Automatic Tuning System for Solving NP-Hard Problems in Clouds, 2016 IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops, p.2, 2016.
URL : https://hal.archives-ouvertes.fr/hal-01427255
Don't Hurry be Happy: a Deadline-based Backfilling Approach, Job Scheduling Strategies for Parallel Processing, p.4, 2017.
Save Watts in your Grid: Green Strategies for Energy-Aware Framework in Large Scale Distributed Systems, IEEE International Conference on Parallel and Distributed Systems (ICPADS), vol.61, p.99, 2008.
URL : https://hal.archives-ouvertes.fr/ensl-00474726
Practical Resource Management in Power-Constrained, High Performance Computing, Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing, p.60, 2015.
Supercomputing Centers and Electricity Service Providers: A Geographically Distributed Perspective on Demand Management in Europe and the United States, International Conference on High Performance Computing, vol.57, p.76, 2016.
Economic viability of hardware overprovisioning in power-constrained high performance computing, Proceedings of the 4th International Workshop on Energy Efficient Supercomputing, p.98, 2016.
INSEE: An interconnection network simulation and evaluation environment, Euro-Par 2005 Parallel Processing, vol.2, p.24, 2005.
Locality-aware policies to improve job scheduling on 3D tori, The Journal of Supercomputing, vol.71, p.24, 2015.
Effects of Topology-Aware Allocation Policies on Scheduling Performance, Job Scheduling Strategies for Parallel Processing, 14th International Workshop, p.47, 2009.
Beyond DVFS: A first look at performance under a hardware-enforced power bound, Parallel and Distributed Processing Symposium Workshops & PhD Forum (IPDPSW), p.60, 2012.
Reconstructable Software Appliances with Kameleon, SIGOPS Oper. Syst. Rev, vol.49, issue.1, p.87, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01334135
Maximizing throughput of overprovisioned hpc data centers under a strict power budget, Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, vol.60, p.98, 2014.
Task Scheduling for Parallel Systems. Wiley Series on Parallel and Distributed Computing, p.46, 2007.
Power management and dynamic voltage scaling: Myths and facts, Proceedings of the 2005 workshop on power aware real-time computing, vol.12, p.81, 2005.
Toward a Realistic Task Scheduling Model, IEEE Trans. Parallel Distrib. Syst, vol.17, issue.3, p.46, 2006.
Using and Modifying the BSC Slurm Workload Simulator, p.102, 2015.
Apache hadoop yarn: Yet another resource negotiator, Proceedings of the 4th annual Symposium on Cloud Computing. ACM, p.3, 2013.
A data driven scheduling approach for power management on HPC systems, High Performance Computing, Networking, Storage and Analysis, SC16: International Conference for, p.103, 2016.
Integrating dynamic pricing of electricity into energy aware scheduling for HPC systems, Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, p.61, 2013.
A Taxonomy of Scientific Workflow Systems for Grid Computing, SIGMOD Rec, vol.34, p.40, 2005.
Slurm: Simple linux utility for resource management, Workshop on Job Scheduling Strategies for Parallel Processing, vol.24, p.102, 2003.
Exploring Plan-Based Scheduling for Large-Scale Computing Systems, Cluster Computing (CLUSTER), p.101, 2016.
Additionally, the work conducted in this dissertation directly led to the following communications:
Cluster, Cloud and Grid Computing (CCGrid), 2016.
IEEE/ACM International Symposium on. IEEE, 2016.
Peer-reviewed international workshops
Pierre-François Dutot, Millian Poquet, and Denis Trystram, International European Conference on Parallel and Distributed Computing, 2015.
Batsim: a Realistic Language-Independent Resources and Jobs Management Systems Simulator, Job Scheduling Strategies for Parallel Processing
URL : https://hal.archives-ouvertes.fr/hal-01333471