Maximally Informative k-Itemset Mining from Massively Distributed Data Streams

Mehdi Zitouni 1, 2, Reza Akbarinia 2, Sadok Ben Yahia 1, Florent Masseglia 2
2 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: We address the problem of mining maximally informative k-itemsets (miki) in data streams based on joint entropy. We propose PentroS, a highly scalable parallel miki mining algorithm that makes mining large volumes of incoming data very efficient. It is designed to account for the continuous nature of data streams, in particular by reducing the computation needed to update the miki results after transactions arrive in or depart from the sliding window. PentroS has been extensively evaluated on massive real-world data streams. Our experimental results confirm the effectiveness of our proposal, which achieves excellent throughput at high itemset lengths.
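To make the notion of a maximally informative k-itemset concrete, the following is a minimal brute-force sketch. It is not PentroS and is not taken from the paper. It assumes the usual definition in which the joint entropy of an itemset is the Shannon entropy of the presence/absence patterns of its items across the transactions currently in the window. The names `joint_entropy` and `brute_force_miki` are hypothetical, and the exhaustive search over all k-item combinations is exactly the cost that a scalable algorithm would aim to avoid.

```python
from collections import Counter, deque
from itertools import combinations
from math import log2


def joint_entropy(window, itemset):
    """Joint entropy (in bits) of the presence/absence patterns of
    `itemset` over the transactions in `window` (assumed definition)."""
    n = len(window)
    if n == 0:
        return 0.0
    # Project every transaction onto the itemset as a tuple of booleans.
    patterns = Counter(tuple(item in t for item in itemset) for t in window)
    return -sum((c / n) * log2(c / n) for c in patterns.values())


def brute_force_miki(window, items, k):
    """Exhaustively search for the k-itemset with the highest joint entropy.
    Illustrative only: the cost grows combinatorially with k."""
    candidates = combinations(sorted(items), k)
    return max(candidates, key=lambda s: joint_entropy(window, s))


# A sliding window holding the 4 most recent transactions (hypothetical data).
window = deque(maxlen=4)
for transaction in [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"c"}]:
    window.append(frozenset(transaction))

best = brute_force_miki(window, {"a", "b", "c"}, k=2)

# A new arrival evicts the oldest transaction; this naive version
# simply recomputes the entropies from scratch.
window.append(frozenset({"c"}))
updated = joint_entropy(window, ("a", "b"))
```

In this toy window, item `c` appears in every transaction and therefore carries no information, so the pair of items whose presence varies most is preferred. Recomputing from scratch after each arrival or departure is the naive baseline; the abstract states that PentroS reduces this update cost.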
Document type:
Conference paper
SAC: Symposium on Applied Computing, Apr 2018, Pau, France. 33rd ACM/SIGAPP Symposium On Applied Computing, pp.1-10, 2018, 〈https://www.sigapp.org/sac/sac2018/〉
Cited literature [18 references]

https://hal.archives-ouvertes.fr/hal-01711990
Contributor: Florent Masseglia
Submitted on: Monday, February 19, 2018 - 10:25:35
Last modified on: Wednesday, November 21, 2018 - 20:33:48
Document(s) archived on: Sunday, May 6, 2018 - 13:15:01

File

ACM_SAC_2018.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: hal-01711990, version 1

Citation

Mehdi Zitouni, Reza Akbarinia, Sadok Ben Yahia, Florent Masseglia. Maximally Informative k-Itemset Mining from Massively Distributed Data Streams. SAC: Symposium on Applied Computing, Apr 2018, Pau, France. 33rd ACM/SIGAPP Symposium On Applied Computing, pp.1-10, 2018, 〈https://www.sigapp.org/sac/sac2018/〉. 〈hal-01711990〉
