Shortest Processing Time First and Hadoop

Abstract : Big data has proven to be a powerful tool in many sectors, from science to business. Distributed data-parallel computing is therefore now common: using a large number of computing and storage resources makes it possible to process data at a previously unattainable scale. But developing large-scale distributed big data processing means tackling many challenges, and scheduling is one of the most complex. Shortest Processing Time First (SPT) is a classic scheduling policy used in many systems, as it is known to be an optimal online scheduling policy for minimizing the average flowtime. We therefore decided to integrate this policy into Hadoop, a framework for big data processing, and built a prototype implementation. This paper describes this integration, as well as the test results obtained on our testbed.
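To make the flowtime claim concrete, here is a minimal sketch that is not taken from the paper or from Hadoop. It assumes a single machine, no preemption, and all jobs released at time 0, so that a job's flowtime equals its completion time. The function names `average_flowtime` and `spt_order` and the job durations are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def average_flowtime(processing_times: Sequence[float]) -> float:
    """Mean completion time when jobs run back to back in the given order.

    Assumption: one machine, no preemption, every job released at time 0.
    """
    if not processing_times:
        raise ValueError("at least one job is required")
    clock = 0.0
    total = 0.0
    for p in processing_times:
        clock += p      # this job completes at the current clock value
        total += clock  # accumulate its flowtime
    return total / len(processing_times)


def spt_order(processing_times: Sequence[float]) -> List[float]:
    """Return the jobs sorted by increasing processing time (SPT order)."""
    return sorted(processing_times)


# Hypothetical job durations, listed in arrival (FIFO) order.
jobs = [8.0, 1.0, 3.0]

fifo_avg = average_flowtime(jobs)            # completions at 8, 9, 12
spt_avg = average_flowtime(spt_order(jobs))  # completions at 1, 4, 12

print(f"FIFO average flowtime: {fifo_avg:.2f}")
print(f"SPT average flowtime:  {spt_avg:.2f}")
```

Running the short job first lowers the average because each job's processing time is added to the flowtime of every job scheduled after it. A real Hadoop scheduler would additionally have to estimate processing times and cope with jobs arriving over time, which this sketch ignores.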

https://hal.archives-ouvertes.fr/hal-01308183
Contributor : Patrick Martineau
Submitted on : Wednesday, April 27, 2016 - 12:38:01 PM
Last modification on : Friday, October 18, 2019 - 1:32:50 AM
Long-term archiving on : Thursday, July 28, 2016 - 10:35:27 AM

File

bare_conf.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01308183, version 1

Citation

Laurent Bobelin, Patrick Martineau, Haiwu He. Shortest Processing Time First and Hadoop. 3rd IEEE International Conference on Cyber Security and Cloud Computing (CSCloud 2016), Jun 2016, Beijing, China. ⟨hal-01308183⟩
