Optimizing OLAP Cubes Construction by Improving Data Placement on Multi-nodes Clusters

Billel Arres; Nadia Kabachi; Omar Boussaid

doi:10.1109/PDP.2015.45

Conference paper, Year: 2015

Billel Arres

Role: Author
PersonId : 948703

Equipe de Recherche en Ingénierie des Connaissances

SID

Nadia Kabachi

Role: Corresponding author
PersonId : 925740

Equipe de Recherche en Ingénierie des Connaissances

SID

Omar Boussaid

Role: Corresponding author
PersonId : 172986
IdHAL : omar-boussaid
ORCID : 0000-0001-6388-3152
IdRef : 074555081

Equipe de Recherche en Ingénierie des Connaissances

SID

Abstract

The increasing volumes of relational data call for alternative ways of handling them. The Hadoop framework, an open-source project based on the MapReduce paradigm, is a popular choice for big data analytics. However, the performance gained from Hadoop's features is currently limited by its default block placement policy, which does not take any data characteristics into account. Indeed, careful data placement can improve the efficiency of many operations, including indexing, grouping, aggregation, and joins. In this paper, we propose a data warehouse placement policy to improve query performance on multi-node clusters, especially Hadoop clusters. We investigate the performance gain for an OLAP cube construction query with and without data organization, varying the number of nodes and the data warehouse size. The proposed data placement policy lowered the global execution time for building OLAP data cubes by up to 20 percent compared to the default data placement.
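
To illustrate why placement can matter for joins, here is a minimal toy sketch in Python. It is not the placement policy proposed in the paper and it does not use any Hadoop API. The function names, the `product_id` join key, the node count, and the synthetic rows are all hypothetical, chosen only for illustration. It compares a content-agnostic placement with one that co-locates fact and dimension rows by join key, and counts how many fact rows would need a dimension row stored on another node.

```python
import random
from collections import defaultdict

def place_randomly(rows, n_nodes, rng):
    """Assign each row to a node without looking at its content."""
    placement = defaultdict(list)
    for row in rows:
        placement[rng.randrange(n_nodes)].append(row)
    return placement

def place_by_key(rows, n_nodes, key):
    """Assign each row to the node selected by its join key,
    so rows sharing a key value end up on the same node."""
    placement = defaultdict(list)
    for row in rows:
        placement[row[key] % n_nodes].append(row)
    return placement

def remote_lookups(fact_placement, dim_placement, key):
    """Count fact rows whose matching dimension row lives on a different node."""
    dim_node = {
        row[key]: node
        for node, rows in dim_placement.items()
        for row in rows
    }
    return sum(
        1
        for node, rows in fact_placement.items()
        for row in rows
        if dim_node[row[key]] != node
    )

# Hypothetical synthetic data: 100 dimension rows, 10,000 fact rows.
rng = random.Random(0)
n_nodes = 4
dims = [{"product_id": i} for i in range(100)]
facts = [
    {"product_id": rng.randrange(100), "amount": rng.randint(1, 50)}
    for _ in range(10_000)
]

random_remote = remote_lookups(
    place_randomly(facts, n_nodes, rng),
    place_randomly(dims, n_nodes, rng),
    "product_id",
)
colocated_remote = remote_lookups(
    place_by_key(facts, n_nodes, "product_id"),
    place_by_key(dims, n_nodes, "product_id"),
    "product_id",
)
print("remote lookups, content-agnostic placement:", random_remote)
print("remote lookups, key-based co-location:", colocated_remote)
```

In this toy model, key-based co-location leaves no remote lookups by construction, while content-agnostic placement leaves many. Real clusters add replication, block sizes, and scheduling effects that the sketch ignores, so it says nothing about the magnitude of the gain reported in the abstract.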

Keywords

MapReduce, HDFS, Data warehouses, Block Placement

Domains

Databases [cs.DB]; Parallel, distributed, and shared computing [cs.DC]

Contributor: Fabien Rico

https://hal.science/hal-01166226

Submitted on: Monday, June 22, 2015, 13:12:22

Last modified on: Saturday, February 25, 2023, 03:53:40

Dates and versions

hal-01166226 , version 1 (22-06-2015)

Identifiers

HAL Id : hal-01166226 , version 1
DOI : 10.1109/PDP.2015.45

Cite

Billel Arres, Nadia Kabachi, Omar Boussaid. Optimizing OLAP Cubes Construction by Improving Data Placement on Multi-nodes Clusters. 23rd EuroPDP International Conference on Parallel, Distributed, and Network-Based P, Mar 2015, Turku, Finland. pp.520 - 524, ⟨10.1109/PDP.2015.45⟩. ⟨hal-01166226⟩

Collections

UNIV-LYON2 ERIC UDL
