Above the clouds: A Berkeley view of cloud computing, EECS Department, 2009.
Virtual Machines: Versatile Platforms For Systems And Processes, 2005.
Enforcing Performance Isolation Across Virtual Machines in Xen, Proceedings of 7th ACM/IFIP/USENIX Int'l Conf. on Middleware (Middleware'06), pp.342-362, 2006.
DOI : 10.1145/956993.956995
Cloud Computing for parallel Scientific HPC Applications: Feasibility of Running Coupled Atmosphere-Ocean Climate Models on Amazon's EC2, Cloud Computing and Its Applications (CCA'08), 2008.
Eucalyptus: an open-source cloud computing infrastructure, Journal of Physics: Conference Series, pp.1-14, 2009.
DOI : 10.1088/1742-6596/180/1/012051
Extending stability beyond CPU millennium, Proceedings of the 2007 ACM/IEEE conference on Supercomputing, SC '07, pp.1-58, 2007.
DOI : 10.1145/1362622.1362700
More Google cluster data, Google Research Blog, 2011.
Google cluster-usage traces: format + schema, Google Inc, 2011.
Towards understanding heterogeneous clouds at scale: Google trace analysis, Intel Science and Technology Center for Cloud Computing, 2012.
Characterization and Comparison of Cloud versus Grid Workloads, 2012 IEEE International Conference on Cluster Computing, pp.230-238, 2012.
DOI : 10.1109/CLUSTER.2012.35
Monetary Cost-Aware Checkpointing and Migration on Amazon Cloud Spot Instances, IEEE Trans. on Services Computing, pp.512-524, 2012.
DOI : 10.1109/TSC.2011.44
URL : https://hal.archives-ouvertes.fr/hal-00788761
Managing Descheduling Risk in the Google Cloud
A higher order estimate of the optimum checkpoint interval for restart dumps, Future Generation Computer Systems, vol.22, issue.3, pp.303-312, 2006.
DOI : 10.1016/j.future.2004.11.016
Optimization of checkpointing-related I/O for high-performance parallel and distributed computing, The Journal of Supercomputing, vol.35, issue.1, pp.150-180, 2008.
DOI : 10.1007/s11227-007-0162-0
A Flexible Checkpoint/Restart Model in Distributed Systems, Proceedings of the 8th international conference on Parallel processing and applied mathematics (PPAM'10), pp.206-215, 2010.
DOI : 10.1007/978-3-642-14390-8_22
URL : https://hal.archives-ouvertes.fr/hal-00788926
A first order approximation to the optimum checkpoint interval, Communications of the ACM, pp.530-531, 1974.
DOI : 10.1145/361147.361115
Xen and the art of virtualization, Proceedings of the 19th ACM symposium on Operating systems principles (SOSP '03), pp.164-177, 2003.
MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492
Berkeley lab checkpoint/restart (BLCR) for Linux clusters, Journal of Physics: Conference Series, p.494, 2006.
DOI : 10.1088/1742-6596/46/1/067
Web search for a planet: the Google cluster architecture, IEEE Micro, vol.23, issue.2, pp.22-28, 2003.
DOI : 10.1109/MM.2003.1196112
Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression, Proceedings of 24th International Conference on Neural Information Processing Systems (NIPS'10), pp.1-9, 2010.
BlobCR, Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis, SC '11, pp.1-3412, 2011.
DOI : 10.1145/2063384.2063429
URL : https://hal.archives-ouvertes.fr/inria-00601865
Error-Tolerant Resource Allocation and Payment Minimization for Cloud System, IEEE Transactions on Parallel and Distributed Systems, vol.24, issue.6, pp.1097-1106, 2013.
DOI : 10.1109/TPDS.2012.309
On the Execution of Large Batch Programs in Unreliable Computing Systems, IEEE Transactions on Software Engineering, vol.10, issue.4, pp.444-450, 1984.
DOI : 10.1109/TSE.1984.5010258
Stochastic models for checkpointing, in Stochastic Models for Fault Tolerance, pp.177-236, 2010.
Optimum retrial number of reliability models, in Advanced Reliability Models and Maintenance Policies, Series in Reliability Engineering, pp.101-122, 2008.
Fault Tolerant Approaches in Cloud Computing Infrastructures, Proceedings of the 8th International Conference on Autonomic and Autonomous Systems (ICAS'12), pp.42-48, 2012.
Building Fault-Tolerant Applications on AWS, Tech. Rep., 2011.