Conference paper, Year: 2015

COMPUTING DATA QUALITY INDICATORS ON BIG DATA STREAMS USING A CEP

Wenlu Yang
  • Role: Author
  • PersonId : 1031724
Alzennyr Gomes da Silva
  • Role: Author
  • PersonId : 865932
EDF
Marie-Luce Picard
  • Role: Author
EDF

Abstract

Big Data is often characterized by the 3Vs: Volume, Velocity and Variety. A fourth V (Validity) was introduced to address the quality dimension. Poor data quality can be costly, disrupt processes and undermine a company's regulatory compliance efforts. Complex event processing (CEP) is a technology developed to process data streams in real time. In France, the ongoing deployment of smart meters will generate massive volumes of electricity consumption data. In this work, we developed a diagnostic approach that computes generic quality indicators for smart meter data streams on the fly. The solution is based on Tibco StreamBase CEP. Visualization tools were also developed to give a better understanding of how quality issues relate to geographical and temporal dimensions. Depending on the application purpose, two visualization methods can be used: (1) StreamBase LiveView visualizes quality indicators in real time; and (2) a Web application provides a posteriori, geographical analysis of the quality indicators, which are plotted on a map using a color scale (lighter colors indicate good quality and darker colors indicate poor quality). In future work, new quality indicators could be added to the solution, which can be applied in an operational context to monitor data quality from smart meters.
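
As a rough illustration of what computing quality indicators over a stream of meter readings can look like, here is a minimal Python sketch. It is not the authors' StreamBase implementation, and the abstract does not name the indicators that were actually computed. Everything in the sketch is an assumption made for illustration: the `Reading` fields, the two indicators (completeness and validity), the one-hour tumbling window, and the `max_kwh` plausibility bound.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass
class Reading:
    """One smart meter event (hypothetical schema, not from the paper)."""
    meter_id: str
    timestamp: int          # seconds since epoch
    kwh: Optional[float]    # None models a missing measurement


@dataclass
class Indicators:
    """Running counters for one (meter, window) pair."""
    count: int = 0
    missing: int = 0
    out_of_range: int = 0

    @property
    def completeness(self) -> float:
        # Share of events that carry a value.
        return 1.0 - self.missing / self.count if self.count else 0.0

    @property
    def validity(self) -> float:
        # Share of present values that fall inside the plausible range.
        present = self.count - self.missing
        return 1.0 - self.out_of_range / present if present else 0.0


def quality_by_window(
    readings: Iterable[Reading],
    window_s: int = 3600,     # assumed tumbling window length
    max_kwh: float = 50.0,    # assumed plausibility bound
) -> Dict[Tuple[str, int], Indicators]:
    """Update indicators in a single pass, one event at a time."""
    stats: Dict[Tuple[str, int], Indicators] = defaultdict(Indicators)
    for r in readings:
        # Integer division assigns each event to its tumbling window.
        ind = stats[(r.meter_id, r.timestamp // window_s)]
        ind.count += 1
        if r.kwh is None:
            ind.missing += 1
        elif not 0.0 <= r.kwh <= max_kwh:
            ind.out_of_range += 1
    return dict(stats)
```

Each event only increments counters, so the indicators can be refreshed as readings arrive rather than in a later batch pass. Keying the result by meter and window is one way to keep the temporal dimension available for later analysis; a geographical key could be added in the same manner if each meter were associated with a location.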
Main file
final version.pdf (1.02 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-01367862, version 1 (19-09-2016)

Identifiers

HAL Id: hal-01367862
DOI: 10.1109/IWCIM.2015.7347061

Cite

Wenlu Yang, Alzennyr Gomes da Silva, Marie-Luce Picard. COMPUTING DATA QUALITY INDICATORS ON BIG DATA STREAMS USING A CEP. Computational Intelligence for Multimedia Understanding (IWCIM), 2015 International Workshop on, Oct 2015, Prague, Czech Republic. ⟨10.1109/IWCIM.2015.7347061⟩. ⟨hal-01367862⟩
180 views
206 downloads
