Conference paper, Year: 2019

Data quality in ETL process: A preliminary study

Abstract

The accuracy and relevance of Business Intelligence & Analytics (BI&A) rely on the ability to bring high-quality data into the data warehouse from both internal and external sources through the Extract, Transform, Load (ETL) process. This process is complex and time-consuming because it handles data with heterogeneous content and diverse quality problems. Ensuring data quality requires tracking quality defects throughout the ETL process. In this paper, we present the main ETL quality characteristics and provide an overview of existing approaches to data quality in the ETL process. We also present a comparative study of several commercial ETL tools to show the extent to which they take data quality dimensions into account. To illustrate our study, we carry out experiments using a dedicated ETL solution (Talend Data Integration) and a dedicated data quality solution (Talend Data Quality). Based on our study, we identify and discuss quality challenges to be addressed in our future research.
Main file
1-s2.0-S1877050919314097-main.pdf (1.06 MB)
Origin: Publisher files allowed on an open archive

Dates and versions

hal-02424279, version 1 (09-01-2020)

License

Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)

Cite

Manel Souibgui, Faten Atigui, Saloua Zammali, Samira Si-Said Cherfi, Sadok Ben Yahia. Data quality in ETL process: A preliminary study. 23rd International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, Sep 2019, Budapest, Hungary. pp.676-687, ⟨10.1016/j.procs.2019.09.223⟩. ⟨hal-02424279⟩