Itemset Approximation Using Constrained Binary Matrix Factorization

Abstract: We address the problem of efficiently finding a small number of representative frequent itemsets in transaction matrices. To do so, we rely on matrix decomposition techniques, and more precisely on Constrained Binary Matrix Factorization (CBMF), which decomposes a given binary matrix into the product of two lower-dimensional binary matrices, called factors. We first show that, under binary constraints, one can interpret the first factor as a transaction matrix operating on packets of items, whereas the second factor indicates which item belongs to which packet. We then formally prove that one can directly mine the CBMF factors to find (approximate) itemsets of a given size and support in the original transaction matrix. Finally, through a detailed experimental study, we show that the frequent itemsets produced by our method represent a significant portion of the set of all frequent itemsets according to existing metrics, while being up to several orders of magnitude less numerous.
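
The interpretation of the two factors described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy matrices `W` and `H`, the helper names `packet_itemsets` and `support`, and the reading of the binary constraint as "the ordinary product of the two factors must itself be binary". It is not the authors' algorithm and it does not compute a factorization; it only shows how, given an exact factorization, each packet yields an itemset whose support in the original matrix is at least the number of transactions using that packet.

```python
import numpy as np

# Hypothetical toy factors (not taken from the paper).
# W: 4 transactions x 2 packets; W[t, k] = 1 if transaction t uses packet k.
W = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 0]])
# H: 2 packets x 5 items; H[k, i] = 1 if item i belongs to packet k.
H = np.array([[1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1]])

# Transaction matrix reconstructed from the factors (ordinary matrix product).
X = W @ H
# Assumed form of the binary constraint: the product must stay binary.
assert np.isin(X, [0, 1]).all()


def packet_itemsets(W, H):
    """Return, for each packet, its items and the number of transactions using it."""
    result = []
    for k in range(H.shape[0]):
        items = tuple(int(i) for i in np.flatnonzero(H[k]))
        result.append((items, int(W[:, k].sum())))
    return result


def support(X, items):
    """Count the transactions (rows of X) that contain every item in `items`."""
    return int(X[:, list(items)].all(axis=1).sum())


for items, n_transactions in packet_itemsets(W, H):
    # With an exact factorization, every transaction that uses a packet
    # contains all of its items, so the support is at least n_transactions.
    assert support(X, items) >= n_transactions
```

In this toy case the two quantities coincide for both packets; with an approximate factorization of real data the packet-based count would only approximate the true support, which is why the abstract speaks of approximate itemsets.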
Document type: Conference paper
Conference on Data Science and Advanced Analytics (DSAA), 2014, Shanghai, China. pp. 1-7

https://hal.archives-ouvertes.fr/hal-01071751
Contributor: Maria-Irina Nicolae
Submitted on: Monday, October 6, 2014 - 15:57:12
Last modified on: Tuesday, October 28, 2014 - 18:34:06

Identifiers

  • HAL Id : hal-01071751, version 1

Collections

UGA | LIG

Citation

Hamid Mirisaee, Eric Gaussier, Alexandre Termier. Itemset Approximation Using Constrained Binary Matrix Factorization. Conference on Data Science and Advanced Analytics (DSAA), 2014, Shanghai, China. pp. 1-7. <hal-01071751>
