Emergent pattern detection algorithm for big data streams - HAL open archive
Conference paper, Year: 2020

Emergent pattern detection algorithm for big data streams

Lina Fahed
Ayman Alfalou

Abstract

Pattern detection is an active field in big data stream analytics with numerous open challenges. Because of the high velocity and variety of the data, new patterns can appear and change over time. Existing state-of-the-art solutions update the pattern detection model regularly in order to integrate newly appeared and validated patterns. However, in several applications, such as security and defense, patterns can represent anomalies. It is therefore crucial to detect new patterns (i.e., new anomalies) as early as possible in order to react at the right moment, which makes emergent pattern detection a very challenging task. To tackle this challenge, we propose EPDA (Emergent Pattern Detection Algorithm): a new and validated algorithm for detecting emergent patterns in data streams. The originality of EPDA lies in exploiting frequent pattern mining techniques and proposing new statistical measures to estimate the evolution of emergent patterns over time. To perform this detection in real time, EPDA runs on the well-known Apache Storm distributed real-time computation system. To better fit our algorithm, we propose a new Apache Storm topology composed of one level of spouts and two levels of bolts. Experiments on a real data stream have shown the relevance of the proposed measures and the efficiency of our algorithm in a prediction task and in terms of execution time.
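The abstract does not define EPDA's statistical measures, so the following is only a minimal sketch of the general idea it describes: mine frequent patterns per window of the stream, then compare supports across consecutive windows to flag patterns that are emerging. The growth ratio, the thresholds (`min_support`, `min_growth`, `floor`), and the function names are all assumptions made for illustration, not the measures or the Apache Storm topology proposed in the paper.

```python
from collections import Counter
from itertools import combinations


def window_supports(transactions, max_size=2):
    """Relative support of every itemset (up to max_size items) in one window."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # canonical order so each itemset has one key
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    total = len(transactions)
    return {pattern: n / total for pattern, n in counts.items()}


def emergent_patterns(previous, current, min_support=0.3, min_growth=2.0, floor=0.05):
    """Patterns frequent in the current window whose support grew sharply.

    Hypothetical measure: growth = current support / max(previous support, floor).
    The floor keeps the ratio finite for patterns absent from the previous window.
    """
    result = {}
    for pattern, support in current.items():
        if support < min_support:
            continue  # not frequent enough in the current window
        growth = support / max(previous.get(pattern, 0.0), floor)
        if growth >= min_growth:
            result[pattern] = growth
    return result


# Two consecutive toy windows of a stream; each inner list is one transaction.
old_window = [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]
new_window = [["a", "d"], ["d", "c"], ["a", "d"], ["b", "d"]]

emerging = emergent_patterns(window_supports(old_window), window_supports(new_window))
```

In this toy run, the item `d` and the pair `(a, d)` are flagged, because both are absent from the first window and frequent in the second. In a distributed setting, the per-window counting and the cross-window comparison could plausibly be placed in separate processing stages, but how EPDA assigns work to its spout and bolt levels is not stated in the abstract.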
Main file
11400-22-submitted-paper.pdf (421.39 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-02558083, version 1 (14-05-2020)

Identifiers

HAL Id: hal-02558083
DOI: 10.1117/12.2558536

Cite

Lina Fahed, Ayman Alfalou. Emergent pattern detection algorithm for big data streams. SPIE. Defense + Commercial Sensing, Pattern Recognition and Tracking XXXI, Apr 2020, California, United States. pp. 114000M, ⟨10.1117/12.2558536⟩. ⟨hal-02558083⟩