Efficient and versatile FPGA acceleration of support counting for stream mining of sequences and frequent itemsets

Abstract: Stream processing has become extremely popular for analyzing huge volumes of data for a variety of applications, including IoT, social networks, retail, and software log analysis. Streams of data are produced continuously, and are mined to extract patterns characterizing the data. A class of data mining algorithms, called generate-and-test, produces a set of candidate patterns that are then evaluated over the data. The main challenges of these algorithms are to achieve high throughput, low latency, and reduced power consumption. In this paper, we present a novel power-efficient, fast, and versatile hardware architecture whose objective is to monitor a set of target patterns in order to maintain their frequency over a stream of data. This accelerator can be used to accelerate data mining algorithms, including itemset and sequence mining. The massive fine-grain reconfiguration capability of FPGA technologies is ideal for implementing the high number of pattern detection units needed for these intensive data mining applications. We have thus designed and implemented an IP that features high-density FPGA occupation and high working frequency. We provide a detailed description of the IP's internal micro-architecture and of its actual implementation and optimization for the targeted FPGA resources. We validate our architecture by developing a co-designed implementation of the Apriori Frequent Itemset Mining (FIM) algorithm, and perform numerous experiments against existing hardware and software solutions. We demonstrate that FIM hardware acceleration is particularly efficient for large and low-density datasets (i.e., long-tailed datasets). Our IP reaches a data throughput of 250 million items/s and monitors up to 11.6k patterns simultaneously, on a prototyping board that overall consumes 24 W in the worst case. Furthermore, our hardware accelerator remains generic and can be integrated into other generate-and-test algorithms.
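The generate-and-test scheme mentioned in the abstract can be illustrated with a minimal software sketch of Apriori-style frequent itemset mining. This is not the paper's hardware architecture or its co-designed implementation; it only shows, under simplifying assumptions, where support counting sits in the loop. The function names (`count_support`, `generate_candidates`, `apriori`) and the `min_support` parameter are hypothetical, and the stream is assumed to be materialized as a list so that it can be scanned once per candidate size.

```python
from itertools import combinations


def count_support(transactions, candidates):
    """Test step: count, for each candidate itemset, the transactions that contain it."""
    support = {c: 0 for c in candidates}
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:  # candidate is a subset of the transaction
                support[candidate] += 1
    return support


def generate_candidates(frequent, size):
    """Generate step: join frequent itemsets and keep unions whose
    (size - 1)-subsets are all frequent (the Apriori pruning rule)."""
    candidates = set()
    for a in frequent:
        for b in frequent:
            union = a | b
            if len(union) == size and all(
                frozenset(sub) in frequent for sub in combinations(union, size - 1)
            ):
                candidates.add(union)
    return candidates


def apriori(transactions, min_support):
    """Return {itemset: support} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in transactions for item in t}
    result = {}
    size = 1
    while candidates:
        support = count_support(transactions, candidates)  # one pass over the data
        frequent = {c for c, n in support.items() if n >= min_support}
        result.update({c: support[c] for c in frequent})
        size += 1
        candidates = generate_candidates(frequent, size)
    return result


if __name__ == "__main__":
    data = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
    for itemset, n in sorted(apriori(data, 2).items(), key=lambda kv: sorted(kv[0])):
        print(sorted(itemset), n)
```

In this sketch, `count_support` is the step that scans every transaction against every monitored candidate, which is the kind of workload the abstract describes offloading to parallel pattern detection units.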
Document type:
Journal article
ACM Transactions on Reconfigurable Technology and Systems (TRETS), ACM, 2017, 10 (3), pp.21. 〈http://dl.acm.org/citation.cfm?id=3027485〉. 〈10.1145/3027485〉

Cited literature [26 references]

https://hal.archives-ouvertes.fr/hal-01474234
Contributor: Adrien Prost-Boucle
Submitted on: Wednesday, 22 February 2017 - 16:08:02
Last modified on: Wednesday, 31 May 2017 - 12:02:44
Document(s) archived on: Tuesday, 23 May 2017 - 14:20:19

File

paper-hal.pdf
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution - NoDerivatives 4.0 International License

Citation

Adrien Prost-Boucle, Frédéric Pétrot, Vincent Leroy, Hande Alemdar. Efficient and versatile FPGA acceleration of support counting for stream mining of sequences and frequent itemsets. ACM Transactions on Reconfigurable Technology and Systems (TRETS), ACM, 2017, 10 (3), pp.21. 〈http://dl.acm.org/citation.cfm?id=3027485〉. 〈10.1145/3027485〉. 〈hal-01474234〉

Metrics

Record views: 528

Document downloads: 145