Conference papers

On Exploiting Data Locality for Iterative Mapreduce Applications in Hybrid Clouds

Abstract : Hybrid cloud bursting (i.e., leasing temporary off-premise cloud resources to boost capacity during peak utilization) has made a significant impact, especially for big data analytics, where the explosion of data sizes and increasingly complex computations frequently lead to insufficient local data center capacity. Cloud bursting, however, introduces a major challenge to runtime systems due to the limited throughput and high latency of data transfers between on-premise and off-premise resources (the weak link). This issue and how to address it are not well understood. We contribute a comprehensive study of what challenges arise in this context, what potential strategies can be applied to address them, and what best practices can be leveraged in real life. Specifically, we focus our study on iterative MapReduce applications, which are a class of large-scale data-intensive applications particularly popular on hybrid clouds. In this context, we study how data locality can be leveraged over the weak link both from the storage layer perspective (when and how to move data off-premise) and from the scheduling perspective (when to compute off-premise). We conclude with a brief discussion on how to set up an experimental framework suitable for studying the effectiveness of our proposal in future work.

Cited literature [26 references]

Contributor : Bogdan Nicolae
Submitted on : Friday, February 24, 2017 - 3:12:02 PM
Last modification on : Thursday, June 1, 2017 - 4:27:34 PM


Francisco Clemente-Castello, Bogdan Nicolae, Rafael Mayo, Juan Carlos Fernandez, M. Mustafa Rafique. On Exploiting Data Locality for Iterative Mapreduce Applications in Hybrid Clouds. BDCAT'16: 3rd IEEE/ACM International Conference on Big Data Computing, Applications and Technologies, Dec 2016, Shanghai, China. pp.118 - 122, ⟨10.1145/3006299.3006329⟩. ⟨hal-01476052⟩


