Conference papers

TopPI: An Efficient Algorithm for Item-Centric Mining

Abstract: We introduce TopPI, a new semantics and algorithm designed to mine long-tailed datasets. For each item, regardless of its frequency, TopPI finds the k most frequent closed itemsets that the item belongs to. For example, in our retail dataset, TopPI finds the itemset "nori seaweed, wasabi, sushi rice, soy sauce", which occurs in only 133 store receipts out of 290 million. It also finds the itemset "milk, puff pastry", which appears 152,991 times. Thanks to dynamic threshold adjustment and an adequate pruning strategy, TopPI efficiently traverses the relevant parts of the search space and can be parallelized on multi-core machines. Our experiments on datasets with different characteristics show the high performance of TopPI and its superiority over state-of-the-art mining algorithms. We show experimentally on real datasets that TopPI allows the analyst to explore and discover valuable itemsets.
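To make the item-centric semantics concrete, here is a minimal brute-force sketch in Python. It is not the TopPI algorithm: it uses no dynamic threshold, pruning, or parallelism, and it only suits toy inputs. The function names `closed_itemsets` and `top_k_per_item`, the toy receipts, and the tie-breaking rule are assumptions made for illustration. Whether single-item itemsets should count among the results is also an assumption; this sketch keeps them.

```python
from collections import defaultdict


def closed_itemsets(transactions):
    """Brute-force map from each non-empty closed itemset to its support."""
    transactions = [frozenset(t) for t in transactions]
    # A non-empty itemset is closed exactly when it is the intersection
    # of one or more transactions, so intersect until nothing new appears.
    closed = {t for t in transactions if t}
    frontier = set(closed)
    while frontier:
        new = set()
        for itemset in frontier:
            for t in transactions:
                common = itemset & t
                if common and common not in closed:
                    new.add(common)
        closed |= new
        frontier = new
    # Support = number of transactions that contain the itemset.
    return {c: sum(1 for t in transactions if c <= t) for c in closed}


def top_k_per_item(transactions, k):
    """For every item, keep the k most frequent closed itemsets containing it."""
    per_item = defaultdict(list)
    for itemset, support in closed_itemsets(transactions).items():
        for item in itemset:
            per_item[item].append((support, itemset))
    # Sort by decreasing support; break ties alphabetically (an arbitrary choice).
    return {
        item: sorted(cands, key=lambda p: (-p[0], sorted(p[1])))[:k]
        for item, cands in per_item.items()
    }


# Hypothetical toy receipts, loosely inspired by the abstract's examples.
receipts = [
    {"milk", "puff pastry"},
    {"milk", "puff pastry", "eggs"},
    {"milk", "eggs"},
    {"nori", "wasabi", "sushi rice"},
    {"nori", "wasabi", "sushi rice", "milk"},
]
top = top_k_per_item(receipts, k=2)
for item in sorted(top):
    print(item, [(support, sorted(itemset)) for support, itemset in top[item]])
```

On this toy data, the rare item "nori" still receives its own results (led by the itemset of nori, sushi rice, and wasabi, with support 2), even though "milk" appears in far more receipts. A single global frequency threshold would not give this per-item guarantee.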

Contributor: Vincent Leroy
Submitted on: Friday, August 19, 2016 - 11:54:29 AM
Last modification on: Wednesday, July 6, 2022 - 4:21:17 AM
Long-term archiving on: Sunday, November 20, 2016 - 10:22:44 AM





Martin Kirchgessner, Vincent Leroy, Alexandre Termier, Sihem Amer-Yahia, Marie-Christine Rousset. TopPI: An Efficient Algorithm for Item-Centric Mining. 18th International Conference on Big Data Analytics and Knowledge Discovery, Sep 2016, Porto, Portugal. ⟨10.1007/978-3-319-43946-4_2⟩. ⟨hal-01354713⟩


