Data variety, come as you are in multi-model data warehouses - Archive ouverte HAL Accéder directement au contenu
Article Dans Une Revue Information Systems Année : 2022

Data variety, come as you are in multi-model data warehouses

Résumé

Multi-model DBMSs (MMDBMSs) have been recently introduced to store and seamlessly query heterogeneous data (structured, semi-structured, graph-based, etc.) in their native form, aimed at effectively preserving their variety. Unfortunately, when it comes to analyzing these data, traditional data warehouses (DWs) and OLAP systems fall short because they rely on relational DBMSs for storage and querying, thus constraining data variety into the rigidity of a structured, fixed schema. In this paper, we investigate the performances of an MMDBMS when used to store multidimensional data for OLAP analyses. A multi-model DW would store each of its elements according to its native model; among the benefits we envision for this solution, that of bridging the architectural gap between data lakes and DWs, that of reducing the cost for ETL, and that of ensuring better flexibility, extensibility, and evolvability thanks to the combined use of structured and schemaless data. To support our investigation we define a multidimensional schema for the UniBench benchmark dataset and an ad hoc OLAP workload for it. Then we propose and compare three logical solutions implemented on the PostgreSQL multi-model DBMS: one that extends a star schema with JSON, XML, graph-based, and key-value data; one based on a classical (fully relational) star schema; and one where all data are kept in their native form (no relational data are introduced). As expected, the full-relational implementation generally performs better than the multi-model one, but this is balanced by the benefits of MMDBMSs in dealing with variety. Finally, we give our perspective view of the research on this topic.

Dates et versions

hal-03452603 , version 1 (27-11-2021)

Identifiants

Citer

Sandro Bimonte, Enrico Gallinucci, Patrick Marcel, Stefano Rizzi. Data variety, come as you are in multi-model data warehouses. Information Systems, 2022, 104, pp.101734. ⟨10.1016/j.is.2021.101734⟩. ⟨hal-03452603⟩
62 Consultations
0 Téléchargements

Altmetric

Partager

Gmail Facebook X LinkedIn More