A new classification of datasets for frequent itemsets

Frédéric Flouvat; Fabien de Marchi; Jean-Marc Petit

Journal article, Journal of Intelligent Information Systems, Year: 2010

Frédéric Flouvat

Role: Author
PersonId: 179895
IdHAL: frederic-flouvat
ORCID: 0000-0001-7288-0498
IdRef: 112090508

Laboratoire d'InfoRmatique en Image et Systèmes d'information

Fabien de Marchi

Role: Author
PersonId: 6947
IdHAL: fabien-de-marchi
IdRef: 078523125

Base de Données

Jean-Marc Petit

Role: Author
PersonId: 4224
IdHAL: jean-marc-petit
ORCID: 0000-0002-0015-745X

Base de Données

Abstract

The discovery of frequent patterns is a well-known problem in data mining. While many algorithms have been proposed over the last decade, only a few contributions have tried to understand how datasets influence the algorithms' behavior. Explaining why certain algorithms are likely to perform very well or very poorly on some datasets is still an open question. In this setting, we describe a thorough experimental study of datasets with respect to frequent itemsets. We study the distribution of frequent itemsets with respect to itemset size, together with the distribution of three concise representations: frequent closed, frequent free, and frequent essential itemsets. For each of them, we also study the distribution of their positive and negative borders whenever possible. The main outcome of these experiments is a new classification of datasets that is invariant with respect to minsup variations and robust enough to explain the efficiency of several implementations.
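To make the quantities named in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the authors' code or experimental setup. The toy transactions, the `MINSUP` value, and the helper names `analyse` and `size_distribution` are assumptions introduced for illustration. The sketch uses the standard textbook definitions: an itemset is frequent if its support reaches minsup, closed if no proper superset has the same support, and free if no proper subset (the empty set included) has the same support. The positive border is the set of maximal frequent itemsets and the negative border the set of minimal infrequent itemsets. Only the borders of the frequent itemsets themselves are computed, and essential itemsets are left out.

```python
from collections import Counter
from itertools import combinations

# Toy transaction database (hypothetical, not taken from the paper).
TRANSACTIONS = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
    {"a"},
]
MINSUP = 2  # absolute minimum support threshold (assumed value)


def size_distribution(itemsets):
    """Map each itemset size to the number of itemsets of that size."""
    return dict(sorted(Counter(len(x) for x in itemsets).items()))


def analyse(transactions, minsup):
    """Brute-force enumeration; exponential in the number of items."""
    items = sorted(set().union(*transactions))

    # Support of every subset of the item universe, empty set included.
    supp = {}
    for k in range(len(items) + 1):
        for combo in combinations(items, k):
            x = frozenset(combo)
            supp[x] = sum(1 for t in transactions if x <= t)

    def parents(x):
        """Subsets of x obtained by removing exactly one item."""
        return [x - {i} for i in x]

    nonempty = [x for x in supp if x]
    frequent = {x for x in nonempty if supp[x] >= minsup}

    # Closed: no proper superset has the same support.
    closed = {x for x in frequent
              if not any(x < y and supp[y] == supp[x] for y in frequent)}

    # Free: every immediate subset has strictly larger support. Support
    # never increases when items are added, so checking immediate
    # subsets is enough to cover all proper subsets.
    free = {x for x in frequent
            if all(supp[p] > supp[x] for p in parents(x))}

    # Positive border: maximal frequent itemsets.
    positive_border = {x for x in frequent
                       if not any(x < y for y in frequent)}

    # Negative border: infrequent itemsets whose immediate subsets
    # are all frequent (minimal infrequent itemsets).
    negative_border = {x for x in nonempty
                       if supp[x] < minsup
                       and all(supp[p] >= minsup for p in parents(x))}

    return {
        "frequent": frequent,
        "closed": closed,
        "free": free,
        "positive_border": positive_border,
        "negative_border": negative_border,
    }


if __name__ == "__main__":
    for name, itemsets in analyse(TRANSACTIONS, MINSUP).items():
        print(name, size_distribution(itemsets))
```

On this toy data the seven frequent itemsets reduce to three closed and two free itemsets, which illustrates why such representations are called concise. Real studies of this kind rely on dedicated mining implementations, since full enumeration does not scale beyond a handful of items.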

Domains

Computer Science [cs]

https://hal.science/hal-01381427

Submitted on: Friday, October 14, 2016, 14:45:14

Last modified on: Wednesday, July 5, 2023, 15:28:04

Dates and versions

hal-01381427 , version 1 (14-10-2016)

Identifiers

HAL Id : hal-01381427 , version 1

Cite

Frédéric Flouvat, Fabien de Marchi, Jean-Marc Petit. A new classification of datasets for frequent itemsets. Journal of Intelligent Information Systems, 2010, 1, 34, pp.1-19. ⟨hal-01381427⟩

Collections

CNRS UNIV-LYON1 UNIV-LYON2 INSA-LYON EC-LYON LIRIS LABEXIMU INSA-GROUPE UDL
