Conference Paper. Year: 2007

Mining Optimal Decision Trees from Itemset Lattices

Abstract

We present DL8, an exact algorithm for finding a decision tree that optimizes a ranking function under size, depth, accuracy and leaf constraints. Because the discovery of optimal trees has high theoretical complexity, until now no efforts have been made to compute such trees for real-world datasets. An exact algorithm is of both scientific and practical interest. From a scientific point of view, it can be used as a gold standard to evaluate the performance of heuristic decision tree learners and to gain new insight into these traditional learners. From the application point of view, it can be used to discover trees that cannot be found by heuristic decision tree learners. The key idea behind our algorithm is the relation between constraints on decision trees and constraints on itemsets. We propose to exploit lattices of itemsets, from which we can extract optimal decision trees in linear time. We give several strategies to efficiently build these lattices. Experiments show that under the same constraints, DL8 achieves better test results than C4.5, which confirms that exhaustive search does not always imply overfitting. The results also show that DL8 is a useful and interesting tool to learn decision trees under constraints.
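The relation between decision trees and itemsets mentioned in the abstract can be illustrated with a small sketch. Each root-to-leaf path is treated as an itemset of (feature, value) tests, and the best subtree for an itemset is cached so that paths testing the same items in a different order share one computation. This is not the authors' implementation: the toy dataset, the function names (`cover`, `leaf`, `best_tree`), the choice of training error as the ranking function, and the `depth_left` and `min_leaf` constraints are all assumptions made for illustration.

```python
from functools import lru_cache

# Hypothetical toy dataset (XOR): each row is (binary feature tuple, class label).
DATA = [
    ((0, 0), 0),
    ((0, 1), 1),
    ((1, 0), 1),
    ((1, 1), 0),
]
N_FEATURES = 2


def cover(itemset):
    """Rows that satisfy every (feature, value) test in the itemset."""
    return [(x, y) for x, y in DATA if all(x[f] == v for f, v in itemset)]


def leaf(rows):
    """Best single leaf for the rows: (training errors, majority label)."""
    labels = [y for _, y in rows]
    if not labels:
        return 0, None
    majority = max(sorted(set(labels)), key=labels.count)
    return sum(1 for y in labels if y != majority), majority


@lru_cache(maxsize=None)
def best_tree(itemset, depth_left, min_leaf=1):
    """Minimum-error tree for the rows covered by `itemset`.

    Returns (errors, tree); a tree is a label or (feature, left, right).
    `itemset` must be a frozenset so that results can be cached per itemset.
    """
    best = leaf(cover(itemset))
    if depth_left == 0 or best[0] == 0:
        return best
    used = {f for f, _ in itemset}
    for f in range(N_FEATURES):
        if f in used:
            continue
        left_set = itemset | {(f, 0)}
        right_set = itemset | {(f, 1)}
        # Leaf constraint: both children must cover at least min_leaf rows.
        if len(cover(left_set)) < min_leaf or len(cover(right_set)) < min_leaf:
            continue
        left_err, left_tree = best_tree(left_set, depth_left - 1, min_leaf)
        right_err, right_tree = best_tree(right_set, depth_left - 1, min_leaf)
        if left_err + right_err < best[0]:
            best = (left_err + right_err, (f, left_tree, right_tree))
    return best


print(best_tree(frozenset(), 1))
print(best_tree(frozenset(), 2))
```

On this XOR data no single split reduces the training error, so the depth-1 search returns a leaf, while the depth-2 search finds a tree with zero training error. The sketch enumerates itemsets naively and recomputes covers on every call; it does not reflect the lattice-building strategies or the linear-time extraction described in the paper.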
Main file: kdd07_final.pdf (192.78 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00372011 , version 1 (31-03-2009)

Cite

Siegfried Nijssen, Elisa Fromont. Mining Optimal Decision Trees from Itemset Lattices. SIGKDD international conference on Knowledge discovery and data mining, Aug 2007, San Jose, United States. pp.530 - 539, ⟨10.1145/1281192.1281250⟩. ⟨hal-00372011⟩