Itemset Approximation Using Constrained Binary Matrix Factorization

Abstract : We address the problem of efficiently finding a small number of representative frequent itemsets in transaction matrices. To do so, we rely on matrix decomposition techniques, and more precisely on Constrained Binary Matrix Factorization (CBMF), which decomposes a given binary matrix into the product of two lower-dimensional binary matrices, called factors. We first show that, under binary constraints, the first factor can be interpreted as a transaction matrix operating on packets of items, whereas the second factor indicates which items belong to which packet. We then formally prove that one can mine the CBMF factors directly to find (approximate) itemsets of a given size and support in the original transaction matrix. Finally, through a detailed experimental study, we show that the frequent itemsets produced by our method represent a significant portion of the set of all frequent itemsets according to existing metrics, while being up to several orders of magnitude less numerous.
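The packet interpretation can be illustrated with a minimal sketch. It assumes an exact factorization in which the ordinary product of the two binary factors is itself binary; the toy matrices and the helper names `packets_as_itemsets` and `support_in` are hypothetical and are not taken from the paper, and the factorization algorithm itself is not shown.

```python
import numpy as np

# Hypothetical toy factors, chosen by hand for illustration only.
# W: 4 transactions x 2 packets (which transaction contains which packet).
# H: 2 packets x 5 items (which item belongs to which packet).
W = np.array([[1, 0],
              [1, 1],
              [0, 1],
              [1, 0]])
H = np.array([[1, 1, 0, 0, 0],
              [0, 0, 1, 1, 0]])

# Reconstructed transaction matrix (ordinary integer product).
X = W @ H

# Assumed binary constraint: the product must stay in {0, 1}.
assert X.max() <= 1


def packets_as_itemsets(W, H):
    """Read each packet as an itemset, with the number of
    transactions of W that contain the packet as its support."""
    result = []
    for k in range(H.shape[0]):
        items = tuple(int(j) for j in np.flatnonzero(H[k]))
        support = int(W[:, k].sum())
        result.append((items, support))
    return result


def support_in(X, items):
    """Count the rows of X that contain every item of the itemset."""
    return int(X[:, list(items)].all(axis=1).sum())


itemsets = packets_as_itemsets(W, H)
for items, support in itemsets:
    print(items, support, support_in(X, items))
```

In this toy case the support read from the first factor matches the support counted in the reconstructed matrix. With an approximate factorization of real data, the two values would generally differ, which is why the abstract speaks of approximate itemsets.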
Document type :
Conference papers
Contributor : Maria-Irina Nicolae
Submitted on : Monday, October 6, 2014 - 3:57:12 PM
Last modification on : Thursday, October 11, 2018 - 8:48:04 AM


  • HAL Id : hal-01071751, version 1



Hamid Mirisaee, Eric Gaussier, Alexandre Termier. Itemset Approximation Using Constrained Binary Matrix Factorization. Conference on Data Science and Advanced Analytics (DSAA), 2014, Shanghai, China. pp.1-7. ⟨hal-01071751⟩


