Improved deduplication through parallel binning

Zhike Zhang; Deepavali Bhagwat; Witold Litwin; Darrell D.E. Long; Thomas Schwarz

doi:10.1109/PCCC.2012.6407746

Conference paper, Year: 2012

Zhike Zhang

Role: Author

Deepavali Bhagwat

Role: Author

Witold Litwin

Role: Author

Laboratoire d'analyse et modélisation de systèmes pour l'aide à la décision

Darrell D.E. Long

Role: Author

Thomas Schwarz

Role: Author

Abstract

Many modern storage systems use deduplication to compress data by avoiding storing the same data twice. Deduplication must consult information about previously stored data, but accessing information about all stored data can cause a severe bottleneck. Similarity-based deduplication accesses only information on past data that is likely to be similar and thus more likely to yield good deduplication. We present an adaptive deduplication strategy that extends Extreme Binning, and we investigate theoretically and experimentally the effects of the additional bin accesses.
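The idea of similarity-based binning with additional bin accesses can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a file's representative chunk is the one with the smallest hash (as in Extreme Binning) and, as a further assumption not stated in the abstract, that the additional bins are keyed by the next-smallest chunk hashes. The names `chunk_hashes`, `BinnedStore`, and `bins_per_file` are hypothetical, and fixed-size chunking stands in for the content-defined chunking a real system would typically use.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int = 8) -> list[str]:
    """Split data into fixed-size chunks and return the SHA-1 hash of each chunk."""
    return [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


class BinnedStore:
    """Toy store that deduplicates a file only against a few similar bins."""

    def __init__(self, bins_per_file: int = 1):
        # bins_per_file = 1 mimics a single-bin lookup; larger values
        # model the additional bin accesses (assumed selection rule).
        self.bins_per_file = bins_per_file
        # Representative chunk hash -> chunk hashes recorded in that bin.
        self.bins: dict[str, set[str]] = {}
        self.stored_chunks = 0

    def add_file(self, data: bytes) -> int:
        """Store a file and return how many of its chunks were treated as new."""
        unique = set(chunk_hashes(data))
        # Representatives: the smallest chunk hashes of the file.
        reps = sorted(unique)[: self.bins_per_file]
        # Consult only the bins named by the representatives.
        known: set[str] = set()
        for rep in reps:
            known |= self.bins.get(rep, set())
        new_chunks = unique - known
        self.stored_chunks += len(new_chunks)
        # Record this file's chunks in every bin that was consulted.
        for rep in reps:
            self.bins.setdefault(rep, set()).update(unique)
        return len(new_chunks)


store = BinnedStore(bins_per_file=2)
original = b"aaaaaaaabbbbbbbbcccccccc"
print(store.add_file(original))                # first copy: all chunks are new
print(store.add_file(original))                # exact duplicate: nothing new
print(store.add_file(original + b"dddddddd"))  # similar file: only the added chunk
```

In this sketch, a lookup never touches a global chunk index, so some duplicates can be missed when a similar file maps to a different bin. Consulting more bins per file reduces such misses at the cost of extra bin accesses, which is the trade-off the abstract refers to.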

Keywords

deduplication optimization

Domains

Computer Science [cs]

https://hal.science/hal-01495371

Submitted on: Friday, 24 March 2017, 19:00:18

Last modified on: Friday, 19 April 2024, 16:18:54

Dates and versions

hal-01495371 , version 1 (24-03-2017)

Identifiers

HAL Id : hal-01495371 , version 1
DOI : 10.1109/PCCC.2012.6407746

Cite

Zhike Zhang, Deepavali Bhagwat, Witold Litwin, Darrell D.E. Long, Thomas Schwarz. Improved deduplication through parallel binning. 2012 IEEE 31st International Performance Computing and Communications Conference (IPCCC), Dec 2012, Austin, United States. pp.130-141, ⟨10.1109/PCCC.2012.6407746⟩. ⟨hal-01495371⟩

Collections

CNRS UNIV-DAUPHINE LAMSADE-DAUPHINE PSL
