Conference papers

Analysis of the Impact of Data Replication Implemented by Apache Hadoop on Load Balancing

Abstract: Big Data processing tools, such as Apache Hadoop, should ensure data integrity and availability through fault tolerance mechanisms. The Hadoop Distributed File System (HDFS) implements several fault tolerance techniques, among them traditional data replication. In highly scalable clusters, it is important to validate whether the replicated data is spread homogeneously among the computational nodes. In this paper, we experimentally analyze the behavior of HDFS in scenarios with and without failures in order to collect load-balancing metrics for the data replication process adopted by Apache Hadoop. Additional experiments measure the performance achieved by balancing a cluster.

https://hal.archives-ouvertes.fr/hal-02414363
Contributor: Rhauani Fazul
Submitted on: Monday, December 16, 2019 - 3:52:05 PM
Last modification on: Friday, January 29, 2021 - 8:26:43 PM
Long-term archiving on: Tuesday, March 17, 2020 - 7:52:55 PM

File

14355-38970-1-SM.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-02414363, version 1

Citation

Rhauani Weber Aita Fazul, Paulo Vinicius Cardoso, Patricia Pitthan Barcelos. Análise do Impacto da Replicação de Dados Implementada pelo Apache Hadoop no Balanceamento de Carga. Anais do X Computer on the Beach (CotB 2019), Universidade do Vale do Itajaí (UNIVALI), Apr 2019, Florianópolis, SC, Brazil. ⟨hal-02414363⟩

Metrics

Record views: 74
File downloads: 253