Conference paper, year: 2015

Optimizing OLAP Cubes Construction by Improving Data Placement on Multi-nodes Clusters

Abstract

The increasing volume of relational data calls for alternative ways to manage it. The Hadoop framework, an open source project based on the MapReduce paradigm, is a popular choice for big data analytics. However, the performance gained from Hadoop's features is currently limited by its default block placement policy, which does not take any data characteristics into account. Indeed, careful data placement can improve the efficiency of many operations, including indexing, grouping, aggregation, and joins. In this paper, we propose a data warehouse placement policy to improve query performance on multi-node clusters, especially Hadoop clusters. We investigate the performance gain for OLAP cube construction queries with and without data organization, varying the number of nodes and the data warehouse size. We found that the proposed data placement policy lowered the global execution time for building OLAP data cubes by up to 20 percent compared to the default data placement.
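To make the intuition behind content-aware placement concrete, here is a minimal toy sketch in Python. It is not the authors' policy and does not use any Hadoop API: the function names (`place_round_robin`, `place_by_key`, `rows_to_shuffle`), the CRC32-based hash, and the synthetic rows are all assumptions introduced for illustration. It compares a placement that ignores row content with one that co-locates rows sharing a dimension key, and counts how many rows would have to move between nodes before a group-by aggregation could run locally.

```python
import zlib
from collections import defaultdict


def stable_hash(key):
    """Deterministic hash of a string key (hypothetical choice: CRC32)."""
    return zlib.crc32(key.encode("utf-8"))


def place_round_robin(rows, num_nodes):
    """Content-agnostic placement: row i is stored on node i % num_nodes."""
    nodes = defaultdict(list)
    for i, row in enumerate(rows):
        nodes[i % num_nodes].append(row)
    return nodes


def place_by_key(rows, num_nodes, key_index=0):
    """Content-aware placement: rows with the same key land on the same node."""
    nodes = defaultdict(list)
    for row in rows:
        nodes[stable_hash(row[key_index]) % num_nodes].append(row)
    return nodes


def rows_to_shuffle(nodes, num_nodes, key_index=0):
    """Count rows not already on the node responsible for aggregating their key."""
    moved = 0
    for node_id, rows in nodes.items():
        for row in rows:
            if stable_hash(row[key_index]) % num_nodes != node_id:
                moved += 1
    return moved


def aggregate(nodes, key_index=0, measure_index=1):
    """Sum the measure per key across all nodes (the group-by result)."""
    totals = defaultdict(int)
    for rows in nodes.values():
        for row in rows:
            totals[row[key_index]] += row[measure_index]
    return dict(totals)


# Synthetic fact rows: (dimension key, measure). Purely illustrative data.
rows = [(f"region_{i % 5}", i) for i in range(100)]
num_nodes = 4

default_layout = place_round_robin(rows, num_nodes)
keyed_layout = place_by_key(rows, num_nodes)

print("rows to move, content-agnostic:", rows_to_shuffle(default_layout, num_nodes))
print("rows to move, key-based:", rows_to_shuffle(keyed_layout, num_nodes))
```

In this toy setup, both layouts produce the same aggregate, but the content-agnostic layout leaves 75 of the 100 rows on the wrong node for the group-by, while the key-based layout leaves none. Real clusters add replication, block sizes, and skew, so the sketch only shows the direction of the effect and says nothing about the 20 percent figure reported above.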
No file deposited

Dates and versions

hal-01166226, version 1 (22-06-2015)

Cite

Billel Arres, Nadia Kabachi, Omar Boussaid. Optimizing OLAP Cubes Construction by Improving Data Placement on Multi-nodes Clusters. 23rd EuroPDP International Conference on Parallel, Distributed, and Network-Based P, Mar 2015, Turku, Finland. pp. 520-524, ⟨10.1109/PDP.2015.45⟩. ⟨hal-01166226⟩