Conference paper, Year: 2004

Efficient Integration of Data Mining Techniques in Database Management Systems

Abstract

In this paper, we propose a new approach for applying data mining techniques, and more particularly supervised machine learning algorithms, to large databases with acceptable response times. This goal is achieved by integrating these algorithms within a Database Management System. We are thus limited only by disk capacity, and not by available main memory. However, the disk accesses needed to scan the database induce long response times. Hence, we propose an original method that reduces the size of the learning set by building its contingency table. The machine learning algorithms are then adapted to operate on this contingency table. To validate our approach, we implemented the ID3 decision tree construction method and showed that using the contingency table allowed us to obtain response times equivalent to those of classical, in-memory software.
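The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes SQLite (via Python's standard `sqlite3` module) as a stand-in DBMS, and the table `learning_set`, its columns, and the helper names are hypothetical. The contingency table is built here with a plain `GROUP BY ... COUNT(*)`, and the ID3 information gain is then computed from the aggregated counts rather than from the raw rows.

```python
import math
import sqlite3
from collections import defaultdict


def build_contingency_table(conn, table, attributes, target):
    """Collapse the learning set into one row per distinct
    (attribute values, class) combination, with its frequency n."""
    cols = ", ".join(attributes + [target])
    ct = f"{table}_ct"
    # Identifiers are interpolated directly: acceptable in a sketch,
    # not for untrusted input.
    conn.execute(f"DROP TABLE IF EXISTS {ct}")
    conn.execute(
        f"CREATE TABLE {ct} AS "
        f"SELECT {cols}, COUNT(*) AS n FROM {table} GROUP BY {cols}"
    )
    return ct


def entropy(counts):
    """Shannon entropy (base 2) of a list of class frequencies."""
    total = sum(counts)
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)


def information_gain(conn, ct, attribute, target):
    """ID3 information gain of `attribute`, using only the counts n."""
    rows = conn.execute(
        f"SELECT {attribute}, {target}, SUM(n) FROM {ct} "
        f"GROUP BY {attribute}, {target}"
    ).fetchall()
    class_totals = defaultdict(int)
    by_value = defaultdict(lambda: defaultdict(int))
    for value, cls, n in rows:
        class_totals[cls] += n
        by_value[value][cls] += n
    total = sum(class_totals.values())
    # Weighted entropy of the partitions induced by the attribute.
    remainder = sum(
        sum(dist.values()) / total * entropy(list(dist.values()))
        for dist in by_value.values()
    )
    return entropy(list(class_totals.values())) - remainder


# Toy learning set (hypothetical data): 8 rows, 4 distinct combinations.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE learning_set (outlook TEXT, windy TEXT, play TEXT)")
raw = (
    [("sunny", "no", "yes")] * 3
    + [("sunny", "yes", "no")]
    + [("rainy", "no", "no")] * 2
    + [("rainy", "yes", "no")] * 2
)
conn.executemany("INSERT INTO learning_set VALUES (?, ?, ?)", raw)

attributes = ["outlook", "windy"]
ct = build_contingency_table(conn, "learning_set", attributes, "play")
ct_rows = conn.execute(f"SELECT * FROM {ct}").fetchall()
gains = {a: information_gain(conn, ct, a, "play") for a in attributes}
best = max(gains, key=gains.get)  # attribute ID3 would split on first
print(ct_rows)
print(gains, best)
```

In this toy case the eight rows collapse to four aggregated rows. How much a real learning set shrinks depends on how many distinct combinations of attribute values it contains.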
Main file
80_bentayeb_f.pdf (156.9 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-00321982, version 1 (16-09-2008)

License

Attribution

Identifiers

  • HAL Id: hal-00321982, version 1

Cite

Fadila Bentayeb, Jérôme Darmont, Cédric Udréa. Efficient Integration of Data Mining Techniques in Database Management Systems. 8th International Database Engineering and Applications Symposium (IDEAS 2004), 2004, Coimbra, Portugal. pp.59-67. ⟨hal-00321982⟩
