Conference papers

Pufferbench: Evaluating and Optimizing Malleability of Distributed Storage

Abstract : Malleability is the ability of an application to be dynamically rescaled at run time. It requires that resources can be added to or removed from the infrastructure without interruption. Yet many Big Data applications cannot benefit from their inherent malleability, since their colocated distributed storage system is not malleable in practice. Commissioning or decommissioning storage nodes is generally assumed to be slow, as such operations have typically been designed for maintenance only. New technologies, however, enable faster data transfers. Still, evaluating the performance of rescaling operations on a given platform is a challenge in itself: no tool currently exists for this purpose. We introduce Pufferbench, a benchmark for evaluating how fast one can scale a distributed storage system up and down on a given infrastructure and, thereby, how viably one can implement storage malleability on it. It can also serve to quickly prototype and evaluate mechanisms for malleability in existing distributed storage systems. We validate Pufferbench against theoretical lower bounds for commission and decommission: it can achieve performance within 16% of them. We use Pufferbench to evaluate these operations in practice in HDFS: commission in HDFS could be accelerated by as much as 14 times. Our results show that: (1) the lower bounds for commission and decommission times we previously established are sound and can be approached in practice; (2) HDFS could handle these operations much more efficiently; and, most importantly, (3) malleability in distributed storage systems is viable and should be further leveraged for Big Data applications.
Contributor : Gabriel Antoniu
Submitted on : Wednesday, October 10, 2018 - 6:30:08 PM
Last modification on : Saturday, July 11, 2020 - 3:15:26 AM




  • HAL Id : hal-01892713, version 1


Nathanaël Cheriere, Matthieu Dorier, Gabriel Antoniu. Pufferbench: Evaluating and Optimizing Malleability of Distributed Storage. PDSW-DISCS 2018: 3rd Joint International workshop on Parallel Data Storage & Data Intensive Scalable computing Systems, Nov 2018, Dallas, United States. pp.1-10. ⟨hal-01892713⟩


