Conference paper. Year: 2014

Itemset Approximation Using Constrained Binary Matrix Factorization

Abstract

In this paper, we address the problem of efficiently finding a small number of representative frequent itemsets in transaction matrices. To do so, we rely on matrix decomposition techniques, and more precisely on Constrained Binary Matrix Factorization (CBMF), which decomposes a given binary matrix into the product of two lower-dimensional binary matrices, called factors. We first show that, under binary constraints, the first factor can be interpreted as a transaction matrix operating on packets of items, whereas the second factor indicates which item belongs to which packet. We then formally prove that the CBMF factors can be mined directly to find (approximate) itemsets of a given size and support in the original transaction matrix. Finally, through a detailed experimental study, we show that the frequent itemsets produced by our method represent a significant portion of the set of all frequent itemsets according to existing metrics, while being up to several orders of magnitude less numerous.
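The interpretation of the two factors described in the abstract can be illustrated with a small sketch. It is not the paper's CBMF algorithm: no factorization is computed, and the two binary factors `W` (transactions by packets) and `H` (packets by items) are made-up toy values. The function names `boolean_product`, `packet_itemsets`, and `support` are hypothetical and chosen only for this illustration. The sketch assumes a Boolean product and shows that each row of `H` can be read as a candidate itemset whose support in the reconstructed matrix is at least the number of transactions that use the packet.

```python
from typing import FrozenSet, List, Tuple

Matrix = List[List[int]]


def boolean_product(W: Matrix, H: Matrix) -> Matrix:
    """Boolean matrix product: entry (t, i) is 1 if some packet p
    satisfies W[t][p] == 1 and H[p][i] == 1."""
    n_packets = len(H)
    n_items = len(H[0])
    return [
        [int(any(row[p] and H[p][i] for p in range(n_packets)))
         for i in range(n_items)]
        for row in W
    ]


def packet_itemsets(W: Matrix, H: Matrix) -> List[Tuple[FrozenSet[int], int]]:
    """Read each row of H as an itemset (a packet of items) and pair it
    with the number of transactions that use that packet in W."""
    result = []
    for p, packet in enumerate(H):
        items = frozenset(i for i, bit in enumerate(packet) if bit)
        usage = sum(row[p] for row in W)
        result.append((items, usage))
    return result


def support(X: Matrix, items: FrozenSet[int]) -> int:
    """Number of transactions (rows of X) containing every item in `items`."""
    return sum(all(row[i] for i in items) for row in X)


# Toy factors (illustrative assumption, not data from the paper).
W = [[1, 0],   # transaction 0 uses packet 0
     [1, 0],   # transaction 1 uses packet 0
     [1, 1],   # transaction 2 uses both packets
     [0, 1]]   # transaction 3 uses packet 1
H = [[1, 1, 0, 0],   # packet 0 holds items 0 and 1
     [0, 0, 1, 1]]   # packet 1 holds items 2 and 3

X = boolean_product(W, H)
candidates = packet_itemsets(W, H)

for items, usage in candidates:
    # In the Boolean product, every transaction using a packet contains
    # all of its items, so the support can never fall below the usage.
    print(sorted(items), "usage:", usage, "support:", support(X, items))
```

When the factors only approximate the original transaction matrix, the supports read off the factors are likewise approximate, which is consistent with the "(approximate) itemsets" wording of the abstract.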
No file deposited

Dates and versions

hal-01071751, version 1 (06-10-2014)

Identifiers

  • HAL Id: hal-01071751, version 1

Cite

Hamid Mirisaee, Eric Gaussier, Alexandre Termier. Itemset Approximation Using Constrained Binary Matrix Factorization. Conference on Data Science and Advanced Analytics (DSAA), 2014, Shanghai, China. pp. 1-7. ⟨hal-01071751⟩
